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ABSTRACT 

We describe a Bayesian approach to estimating quasar black hole mass functions (BHMF) when 
using the broad emission lines to estimate black hole mass. We show how using the broad line mass 
estimates in combination with statistical techniques developed for luminosity function estimation 
(e.g., the 1/V a correction) leads to statistically biased results. We derive the likelihood function for 
the BHMF based on the broad line mass estimates, and derive the posterior distribution for the 
BHMF, given the observed data. We develop our statistical approach for a flexible model where 
the BHMF is modelled as a mixture of Gaussian functions. Statistical inference is performed using 
markov chain monte carlo (MCMC) methods, and we describe a Metropolis-Hasting algorithm to 
perform the MCMC. The MCMC simulates random draws from the probability distribution of the 
BHMF parameters, given the data, and we use a simulated data set to show how these random draws 
may be used to estimate the probability distribution for the BHMF. In addition, we show how the 
MCMC output may be used to estimate the probability distribution of any quantities derived from 
the BHMF, such as the peak in the space density of quasars. Our method has the advantage that 
it is able to constrain the BHMF even beyond the survey detection limits at the adopted confidence 
level, accounts for measurement errors and the intrinsic uncertainty in broad line mass estimates, and 
provides a natural way of estimating the probability distribution of any quantities derived from the 
BHMF. We conclude by using our method to estimate the local active BHMF using the z < 0.5 Bright 
Quasar Survey sources. At z ~ 0.2, the quasar BHMF falls off approximately as a power law with 
slope ~ 2 for Mbh ^ 1O 8 M . Our analysis implies that at a given Mbh, z < 0.5 broad line quasars 
have a typical Eddington ratio of ~ 0.4 and a dispersion in Eddington ratio of < 0.5 dex. 
Subject headings: galaxies: active — galaxies: mass function — galaxies: statistics — methods: data 
analysis — methods: numerical — methods: statistical 



1. INTRODUCTION 

It is widely accepted that the extraordinary activity associated with quasars 1 involves accretion onto a su- 
permassive black hole (SMBH). The correlation between SMBH mass and both host galaxy l uminosity (e.g., 
iKormendv fe Richstond 119951 : iMagorrian et alj|1998t iMcLure fc Dunlodl2001t iMarconi fc Huntll2003l) and stella r ve- 
locity dispersion (Mbh~g relationship. e.g. JGebhardt et al.ll2000l : lMerritt fe Ferra rcsc 2001; Trcma ine et al.ll2002l), to- 
gethe r with the fact that quasar s have been observed to reside in early-type galaxies (IMcLure et alJll999t iKukula et al.l 
120011: iMcLeod fc McLeodl200lt iNolan et al.ll2001fc iPercival et aLll2001fc [Punlop et al.ll2003f). implies that the evo lution 
of spheroidal ga laxies and quasars is in tricat ely tied together (e.g ., ISilk fc Reed 119981 : lHaehnelt fc Kauffmannl 120001 : 
iMerritt fc Poonll2004t fPi Matteo et~aT1 120051 : iHopkins et all 12009 ). Therefore, investigating the evolution of active 
super-massive black holes (SMBHs) is an important task of modern astronomy, giving insight into the importance of 
AGN activity on the formation of structure in the universe. Determination of the comoving number density, energy 
density, and mass density of active black holes is a powerful probe of the quasar-galaxy connection and the evolution 
of active black holes. 

Recently, advances in reverberation mapping (e.g. JPeterson et al"1l2004D have made it possible to estimate the masses 
of black holes for broad line AGN. A correlation has been found betwee n the size of the region emitting the broad lines 
and the luminosity of the AGN (jKaspi et al.l 120051 : iBentz et alJ l2006t l , allowing one to use the source luminosity to 
estimate the distance between the broad line region (BLR) and the central black hole. In addition, one can estimate 
the velocity dispersion of the BLR gas from the broad emission line width. One then combines the BL R size estimate 
with the velocity estimate to obtain a virial black hole mass as Mbh oc L b V \ where b ~ 1/2 (e.g.. IWandel et all 
119991: IMcLure fc Jarvisl 120021 : IVestergaardl l2002t IVestergaard fc Peterson] l2006ft . Estimates of M BH obtained from 
the broad em ission lines have been used to estimate the distribution of quasar black hole masses at a variety of 
redshifts (e.g- lMcLure fc Dunlop|[2004 IVestergaardll2004 Ikollmeier et aIll2006HWang et alJl200llGreene fc Holl2007t 
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IVestergaard et alj|2008t iFine et al.ll2008f) . 

Given the importance of the BHMF as an observational constraint on models of quasar evolution, it is essential 
that a statistically accurate approach be employed when estimating the BHMF. However, the existence of compli- 
cated selection functions hinders this. A variety of methods have been used to accurately account fo r the selection 
function when estimating the quasar lumino sity function. These include variou s binning methods (e .g., Schmidt 1 19681 : 
lAvni fc Bahcal]|ll980HPage fc Carrerall2000tl . maximum-likelihood fitting (e.g. iMarshall et alJll983t iFan et al.ll2001h 
a semi-parameteric approach ( Schafer!l2007f ). and Bayesian approaches (e.g.. lAndreon et al.l 120051 : iKellv et al.l 120081 . 
hereafter KFV08). In addition, there have be en a variety of methods proposed for estimating the cumulative distribu - 
tion function of the luminosity function fe.g.. lLvnden-Beiilll971l : lEfron fc Petrosia n 1992; Mal onev fc P ctrosian 1999). 
While these techniques have been effective for estimating luminosity functions, estimating the BHMF from the broad 
line mass estimates is a more difficult problem, and currently there does not exist a statistically correct method of 
estimating the BHMF. 

If we could directly measure black hole mass for quasars, and if the selection function only depended on Mbh and 
z, then we could simply employ the formalism developed for luminosity function estimation, after replacing L with 
Mbh- However, surveys are selected based on luminosity and redshift, not on Mbh- At any given luminosity there 
exists a range in black hole mass, and thus one cannot simply employ the luminosity selection function 'as-is' to correct 
for the flux limit. In other words, completeness in flux is not the same thing as completeness in Mbh, and the use of 
a flux selection results in a softer selection function for Mbh- Moreover, we cannot directly observe Mbh for large 
samples of quasars, but rather derive an estimate of Mbh fro m their broad emission lines. The intrinsic uncertainty 
on Mbh derived from the broad emission lines is ~ 0.4 dex (IVestergaard fc Petersonll2006fh and t he uncertainty o n 
Mbh broadens the inferred distribution of M B h (e.g.. IKellv fc Bechtoldl l2007t IShen et all [20071: IFine et~aH 120081) . 
As a result, even if there is no flux limit, the BHMF inferred directly from the broad line mass estimates will be 
systematically underestimated near the peak and overestimated in the tails. In order to ensure an accurate estimate 
of the BHMF it is important to correct for the uncertainty in the estimates of Mbh ■ 

Motivated by these issues, we have developed a Bayesian method for estimating the BHMF. In KFV08 we derived 
the likelihood function and posterior probability distribution for luminosity function estimation, and we described a 
mixture of Gaussian functions model for the luminosity function. In this work, we extend our statistical method and 
derive the likelihood function of the BHMF by relating the observed data to the true BHMF, and derive the posterior 
probability distribution of the BHMF parameters, given the observed data. While the likelihood function and posterior 
are valid for any parameteric form, we focus on a flexible parameteric model where the BHMF is modeled as a sum 
of Gaussian functions. This is a type of 'non-parameteric' approach, where the basic idea is that the individual 
Gaussian functions do not have any physical meaning, but that given enough Gaussian functions one can obtain a 
suitably accurate approximation to the true BHMF. Modeling the BHMF as a mixture of normals avoids the problem 
of choosing a particular parameteric form, especially in the absence of any guidance from astrophysical theory. In 
addition, we describe a markov chain monte carlo (MCMC) algorithm for obtaining random draws from the posterior 
distribution. These random draws allow one to estimate the posterior distribution for the BHMF, as well as any 
quantities derived from it. The MCMC method therefore allows a straight-forward method of calculating errors on 
any quantity derived from the BHMF. Because the Bayesian approach is valid for any sample size, one is able to place 
reliable constraints on the BHMF and related quantities, even where the survey becomes incomplete. 

Because of the diversity and mathematical complexity of some parts of this paper, we summarize the main results 
here. We do this so that the reader who is only interested in specific aspects of this paper can conveniently consult 
the sections of interest. 

• In § 12.21 we derive the general form of the likelihood function for black hole mass function estimation based 
on quasar broad emission lines. Because we can not directly observe Mbh for a large sample of quasars, the 
likelihood function gives the probability of observing a set of redshifts, luminosities, and line widths, given an 
assumed BHMF. In § 12.31 we derive the black hole mass selection function, and discuss how the differences 
between the Mbh selection function and the luminosity selection function affect estimating the BHMF. The 
reader who is interested in the likelihood function of the broad line quasar BHMF, or issues regarding correcting 
for incompleteness in Mbh, should consult this section. 

• In § El we describe a Bayesian approach to black hole mass function estimation. We build on the likelihood 
function derived in § 12.21 to derive the probability distribution of the BHMF, given the observed data (i.e., the 
posterior distribution) . The reader who is interested in a Bayesian approach to BHMF estimation should consult 
this section. 

• In §E]we develop a mixture of Gaussian functions model for the black hole mass function, deriving the likelihood 
function and posterior distribution for this model. Under this model, the BHMF is modelled as a weighted sum 
of Gaussian functions. This model has the advantage that, given a suitably large enough number of Gaussian 
functions, it is flexible enough to give an accurate estimate of any smooth and continuous BHMF. This allows 
the model to adapt to the true BHMF, thus minimizing the bias that can result when assuming a parameteric 
form for the BHMF. In addition, we also describe our statistical model for the distribution of luminosities at a 
given Mbh, and the distribution of line widths at a given L and Mbh- These two distribution are necessary in 
order to link the BHMF to the observed set of luminosities and line widths. The reader who are interested in 
employing our mixture of Gaussian functions model should consult this section. 
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• Because of the large number of parameters associated with black hole mass function estimation, Bayesian infer- 
ence is most easily performed by obtaining random draws of the BHMF from the posterior distribution. In §[5] we 
describe a Metropolis-Hastings algorithm (MHA) for obtaining random draws of the BHMF from the posterior 
distribution, assuming our mixture of Gaussian functions model. The reader who is interested in the compu- 
tational aspects of 'fitting' the mixture of Gaussian functions model, or who is interested in the computational 
aspects of Bayesian inference for the BHMF, should consult this section. 

• In § [5] we use simulation to illustrate the effectiveness of our Bayesian Gaussian mixture model for black hole 
mass func tion estimation. We c onstruct a simulated data set similar to the Sloan Digital Sky Survey DR3 Quasar 
Cataloge ([Schneider et alJ2005f >. We then use our mixture of Gaussian functions model to recover the true BHMF 
and show that our mixture model is able to place reliable constraints on the BHMF over all values of Mbh- 
In constrast, we show that estimating the BHMF by binning up the broad line mass estimates, and applying a 
simple 1/V a correction, systematically biases the inferred BHMF toward larger Mbh- We also illustrate how to 
use the MHA output to constrain any quantity derived from the BHMF, and how to use the MHA output to 
assess the quality of the fit. Finally, we discuss difficulties associated with inferring the distribution of Eddington 
ratios. The reader who is interested in assessing the effectiveness of our statistical approach, or who is interested 
in using the MHA output for statistical inference on the BHMF, should consult this section. 

• In § [7| we use our statistical method to estimate the z < 0.5 BHMF from the Bright Quasar Survey sources. We 
also attempt to infer the mean and dispersion in the z < 0.5 distribution of Eddington ratios. The reader who 
is interested in the scientific results regarding our estimated z < 0.5 BHMF should consult this section. 

We adopt a cosmology based on the the WMAP best-fit parameters (h = 0.71, Q m — 0.27, Q\ — 0.73. lSpergel et all 
[200l 

2. THE LIKELIHOOD FUNCTION 
2.1. NOTATION 

We use the common statistical notation that an estimate of a quantity is denoted by placing a 'hat' above it; e.g., 
6 is an estimate of the true value of the parameter 6. We denote a normal distribution with mean fi and variance a 2 
as N(n, a 2 ), and we denote as N p {jji, S) a multivariate normal distribution with p-element mean vector \i and p x p 
covariance matrix S. If we want to explicitly identify the argument of the Gaussian function, we use the notation 
N(x\fx, a 2 ), which should be understood to be a Gaussian function with mean [i and variance a 2 as a function of x. We 
will often use the common statistical notation where "~" means "is drawn from" or "is distributed as" . This should 
not be confused with the common usage of "~" implying "similar to". For example, x ~ iV(yu, a 2 ) states that x is 
drawn from a normal distribution with mean /i and variance a 2 , whereas x ~ 1 states that the value of x is similar to 
one. 

2.2. Likelihood Function for the BHMF Estimated from AGN Broad Emission Lines 

The black hole mass function, denoted as <P(Mbh, z)dMsH, is the number of sources per comoving volume V(z) 
with black hole masses in the range Mbh, Mbh + dMsn- The black hole mass function is related to the probability 
distribution of (Mbh,z) by 

1 , ,dV 

p(M BH ,z) = —(t>(M BH ,z) — , (1) 

where N is the total number of sources in the universe, and is given by the integral of <f> over Mbh and V(z). If we 
assume a parameteric form for <P{Mbh, z), with parameters we can derive the likelihood function for the observed 
data. The likelihood function is the probability of observing one's data, given the assumed model. The presense of 
selection effects and intrinsic uncertainty in the broad line mass estimates can make this difficult, as the observed data 
likelihood function is not simply given by Equation |T]). However, we can account for these difficulties by first deriving 
the likelihood function for the complete set of data, and then integrating over the missing data to obtain the observed 
data likelihood function. 

For broad line AGNs, we can relate the distribution of Mbh and z to the joint distribution of L\,v, and z. Here, 
v = {vHf3,VMgii,vav), where Vh/3 = Vh@ is the the velocity dispersion for the H/3 broad line emitting gas, and 
similarly for vugii and vqiv- These three line s are commonly u s ed in estimating Mbh from single-epoch spectr a 
of broad line AGN (e.g., iMcLure fc Jarvisl 120021 iKaspi et al1l2005t IVestergaardll2002t IVestergaard fc Peterson! l2006f ). 
where the velocity dispersion is typically estimated from the FWHM of the emission line. The distribution of L\ and 
v are then related to the BHMF via the R-L relationship and the virial theorem. 

The BHMF for broad line AGN can be inferred from the distribution of L\,v, and z, and thus it is necessary to 
formulate the observed data likelihood function in terms of (L\, v, z). While it is possible to formulate the likelihood 

function in terms of the broad line mass estimates, denoted as Mbl <x l]/ 2 V 2 , the logarithm of the broad line 
mass estimates are simply linear combinations of log La and logv, and thus statistical inference does not depend on 
whether we formulate the likelihood function in terms of L\ and v or Mbl- We find it mathematically simpler and 
more intuitive to infer the BHMF directly from the distribution of L\,v, and z, as opposed to inferring it from the 
distribution of L\, Mbl, and z. 
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Following the discussion in KFV08, we derive the likelihood function for the set of observed luminosities, redshifts, 
and emission line widths. We introduce an indicator variable / denoting whether a source is included in the survey or 
not: if Jj = 1 then a source is included, otherwise, Ii = 0. The variable / is considered to be part of the observed data 
in the sense that we 'observe' whether a source is detected or not. The survey selection function is the probability of 
including the i th source in one's survey, p(Ii = l|vj, Lx,i, z\). Here, we have assumed that the probability of including 
a source in one's sample only depends on luminosity, redshift, and emission line width, and is therefore conditionally 
independent of Mbh- This is the case, in general, since one can only select a survey based on quantities that are 
directly observable. Including the additional 'data' /, the observed data likelihood function for broad line AGN is: 

p(v obs ,L obs ,z obs ,I\9, N) oc (2) 
C n II I P(vi,L x ,i,M B H,i,Zi\6) dM BH ,i (3) 



\\ III Pi 1 = \ v ji L Xji z j)p( v j> L \j> M BHj,Zj\d) dvj dL X jdM B H,j dzj (4) 



ieA a 

X 

j&Amis 

cxC[p(J = 0|^)] iV - n J] p{w u L x>uZi \6), (5) 
ieAobs 

where A obs denotes the set of sources included in one's survey, A m is denotes the set of sources not included in one's 
survey, and on the last line we have omitted terms that do not depend on TV or 9. Here, 

/>OC 

p{\i,L\^Zi\6) = / p(-v i ,Lx,i-,z i ,M B H,i\0) dM B H.i (6) 
Jo 

is the probability of observing values of v^, Lx,i, and Zi for the i th source, given 9, and 

/>oo />oo />oo 

P (I = 0\6)= / / p(I = 0\v,L x ,z)p(v,Lx,z\9) dv dL x dz (7) 
Jo Jo Jo 

is the probability that the survey misses a source, given 9; note that p(I = 0\9) = 1 — p(I = 1|#). Qualitatively, 
the observed data likelihood function for the BHMF is the probability of observing a set of n emission line widths 
vi, . . . , v„, luminosities Lx,i, ■ ■ ■ , Lx, n , and redshifts z%, . . . , z n given the assumed BHMF model parameterized by 9, 
multiplied by the probability of not detecting N — n sources given 9, multiplied by the number of ways to select a subset 
of n sources from a set of N total sources. Equation (0 can be maximized to calculate a maximum likelihood estimate 
of the black hole mass function when using broad line estimates of M BB , or combined with a prior distribution to 
perform Bayesian inference. 

It is often preferred to write the BHMF observed data likelihood function by factoring the joint distribution of 
v, Lx, M BB , and z into conditional distributions. This has the advantage of being easier to interpret and work with, 
especially when attempting to connect the distri bution of line widths and luminosities to the distribution of black hole 
mass. The joint distribution can be factored as (jKellv &: Be chtold 20071 ) 

p(v, Lx, M BH , z) = p(v\L x ,M BH , z)p(L x \M BH , z)p{M BH , z). (8) 

Here, p(v\Lx, M BB , z) is the distribution of emission line widths at a given Lx,M BB , and z, p(Lx\M BB , z) is the 
distribution of luminosities at a given M BB and z, and p(M BB , z) is the probability distribution of black hole mass 
and redshift, related to the BHMF via Equation ([1]). When using broad line estimates of M BB , it is assumed that 
p(v\Lx, M BB , z) is set by the virial theorem, where the distance between the central black hole and the broad line- 
emitting gas depe nds on Lx via the R-L relationship. In this work we assume that the R-L relationship does not 
depend on z (e.g.. IVesterga ard 2004]), and thus p(v\Lx, M BB , z) — p(y\L x , M BB )- 
Under the factorization given by Equation {8j), the observed data likelihood function (Eq. [5]) becomes 

p(-v obs ,L obs , z obs ,I\9, N) cx 

C^\p(I = 0\9)f- n J] / p(vi\Lx,i,M B H,i,0)p(LxAMBB,i,z,0)p(M BHiit z\6)dM B H,i. (9) 
ieA obs Jo 

The BHMF likelihood function, given by Equation ([5]) or ([9]), is entirely general, and it is necessary to assume 
parametric forms in order to make use of it. In § [4] we describe a parametric form based on a mixture of Gaussian 
functions model, and explicitly calculate Equation ([5]) for the mixture model. 

2.3. Selection Function 

The selection probability, p{I = l|v, Lx,z), depends on both the luminosity and redshift through the usual flux 
dependence, but can also depend on the emission line width. In particular, an upper limit on v may occur if there is a 
width above which emission lines become difficult to distinguish from the continuum and iron emission. In this case, 
if all emission lines in one's spectrum are wider than the maximum line width than one is not able to obtain a reliable 
estimate of the line width for any emission line, and therefore the source is not used to estimate 4>(Mbh, z). A lower 
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limit on the line width may be imposed in order to prevent the inclusion of narrow line AGN, for which broad line 
mass estimates are not valid. In this case the inclusion criterion might be that at least one emission line is broader 
than, say, FWHM = 2000 km s _1 . In addition to the limits on line width that may be imposed, there is an upper 
and lower limit on z due to redshifting of emission lines out of the observable spectral range. For example, if one uses 
optical spectra than the range of useable spectra is < z < 4.5, as the C IV line redshifts into the near-infrared for 
z>4.5. 

Denote the upper and lower limit of v as u m in and v max , and the upper and lower limit of z as z rn i n and z max . 
Furthermore, denote the usual survey selection function in terms of L\ and z as s(L\,z), where s(L\,z) is the 
probability that a source is included in the survey before any cuts on line width are imposed; s(L\, z) would typically 
correspond to the selection function used in luminosity function estimation. Note that in this work s(L\,z) gives 
the probability that any source in the universe is included in the survey, given its luminosity and redshift, and thus 
s(L\, z) < Vl/A-k, where J7/47T is the fraction of the sky covered by the survey. Then, p(Ii — l|vj, L\^, zi) — s(L\j, Zi) 
if z m i n < Zi < z max and at least one emission line has v m i n < Vi < v max ] otherwise, p(Ii — l|v{, L\^, z{) = 0. In this 
case, the probability that a source is included in the survey (see Eq. [7] ) is 

p(I = l\d) = 

/ / s(L x ,z) / p(-v\Lx,Mbh,O)p(Lx\Mbh,z,0)p(M B h,z\6) dM BH dv dz dL x , (10) 

JO J z min Jv mirl JO 

where the inner two integrals are over v and Mbh, and the outer two integrals are over L\ and z. One can then insert 
Equation (|10p into Equation ([5]) to get the likelihood function. 

It is informative to express the selection function in terms of black hole mass and redshift. The selection function 
as a function of black hole mass and redshift is the probability of including a source, given its Mbh and z, and is 
calculated as 

p(I = 1\M B h,z) = s(L x ,z)p{Lx\Mbh,z) p(v\L x , Mbh) dL x dv. (11) 

J0 Jv min 

At any given value of Mbh a range of luminosities and emission line widths are possible, and thus sources with low 
black hole mass can be detected if they are bright enough and have line widths i> m m < v < v max . Conversely, sources 
with high black hole masses can be missed by the survey if their luminosity is below the flux limit at that redshift, or if 
their line width falls outside of the detectable range. This has the effect of smoothing the survey's selection function, 
and thus the black hole mass selection function is a broadened form of the flux selection function. 
As an example, consider the case when the selection function is simply a flux limit. In this case, the selection function 

is 

s(l z) = { 1 if 47T fminD 2 L (z) < L\ < 4nf max Dl(z) ^ 
* ' ' 1 otherwise ^ ' 

where f m in is the survey's lower flux limit, f max is the survey's upper flux limit, and Dl(z) is the luminosity distance 
to redshift z. For simplicity, in this example we assume that there is no additional cut on emission line width, i.e., 
v m in — and v max — oo. In this case, the black hole mass selection function, p(I — 1\Mbh>z), is the convolution 
of the luminosity selection function with the distribution of L\ at a given Mbh- If the distribution of log La at a 
given Mbh is a Gaussian function with mean ao + a m log Mbh and dispersion 07, then the black hole mass selection 
function is 

p(I = 1\M B h z) = -I- ^°S L ^( Z ) ~ a o - amlogMBH ^ $ ^ \ogL min (z) - a Q - a m \ogM B H ^ ^ 

Here, L max {z) — / ^'Kf max D 2 L (z), L m i n (z) = AKf rn i n D 2 L (z\ and <&(•) is the cumulative distribution function of the 
standard normal distribution. 

In FigureQ]we show the black hole mass selection function, p(I = 1\Mbh, z), given by Equation (|13|) at 2=1. Here, 
we have used the SDSS quasar sample flux limit, 19.1 > i > 15, ao = 37, a m = 1, and <ji = 0.6 dex. Because the black 
hole mass selection function is the convolution of the luminosity selection function with the distribution of L\ at a 
given Mbh-i t ne black hole mass selection function is positive over a wider range in Mbh, as compared to the range 
in L\ for which s(L\, z) is positive. However, because p(l — \\Mbh, z ) spreads the selection probability over a wider 
range in Mbh, all bins in Mbh are incomplete. 

The difference in selection functions for black hole mass and luminosity results in an important distinction between 
the estimation of black hole mass functions and the estimation of luminosity functions. First, one cannot correct the 
binned BHMF for the survey flux limits by simply applying the 1/V a correction. This is a common technique used for 
estimating binned luminosity functions, where the number density in a (L\, z) bin is corrected using the survey volume 
in which a source with luminosity L\ could have been detected and still remained in the redshift bin. In the case of the 
BHMF, a survey volume in which the black hole could have been detected ceases to have any meaning, as black holes 
can be detected over many different survey volumes, albeit with varying probability. Alternatively, the 1/V a correction 
can be thought of as dividing the number of sources in a bin in (L\, z) by the detection probability as a function of 
L\ and z. Therefore, applying a l/V a correction to a bin in (Mbh,z) is essentially the same as dividing the number 
of sources in a bin in (Mbh, z) by the detection probability as a function of L\ and z. For the simple example shown 
in Figure [l] those quasars in a given bin in (Mbh, z) that happen to have luminosities L m i n (z) < L\ < L max (z) will 
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Fig. 1. — Comparison of the selection function for luminosity (shaded region) and black hole mass (curve) for a simple upper and lower 
flux limit. The selection is complete in luminosity within the flux limits, but is never 'complete' in Mbh- The intrinsic physical range in 
luminosity at a given black hole mass creates a more complicated selection function for Mgu, since at any given Mgjj and z only those 
quasars with luminosities within the flux limits are detected. 



receive no correction, since s(L\,z) = 1. However, those quasars which have luminosities outside of the detectable 
range will not be detected. The end result is a systematic underestimate of the binned BHMF. 

The number of sources in a given bin in the Mbh~z plane can be estimated by dividing the observed number of 
black holes in each bin by the black hole mass selection function, p{I = 1\MbHi z )- Similarly, one can use a l/V^-type 
correction by calculating an 'effective' l/V a , found by integrating dV/dz over the black hole mas s selection function . 
This approach h as been adopted previously within the context of binned luminosity functions (e.g.. lWarren et al]|1994t 
iFan et all 1200 1[ ). However, it is essential that the black hole mass selection function be used and not the luminosity 
selection function. Unfortunately, this implies that one must assume a form for p(L\\Mbh, z). 

3. POSTERIOR DISTRIBUTION FOR THE BHMF PARAMETERS 

The posterior probability distribution of the model parameters is 

p(9, N\v obs ,L obs , z b s ,I) oc p(9, N)p(v obs ,L obs ,z obs , I\0, N), (14) 

where p(9, N) is the prior on (6, N), a,nd p(v obs , L obs , z obs , I\8, N) is the likelihood function, given by Equation §5§. The 
posterior distribution gives the probability that 9 and N have a given value, given the observed data (v Q b s , L obs , z obs ). 
Therefore, the posterior distribution of 9 and N can be used to obtain the probability that 4>(Mbh, z) has any given 
value, given that we have observed some set of emission line widths, luminosities, and redshifts. 

It is of use to decompose the posterior as p(N, 9\x obs ) oc p(N\9,x obs )p(9\x obs ), where we have abbreviated the 
observed data ). This decomposition seperates the posterior into the conditional posterior of 

the BHMF normalization, p(N\x obs ,9), from the marginal posterior of the BHMF shape, p{9\x obs ). In this work we 
take N and 9 to be independent in their pri or distribution. p(N , 9) — p{N)p{9), and that the prior on N is uniform 
over \ogN. In this case, one case show (e.g.. iGelman et al"1l2004l KFV08) that the marginal posterior distribution of 
9 is 

p(9\v obs ,L obs ,z obs ) <xp(0) [p{I = l|0)] _n Y[ K v i>-kA,i,2i|0), (15) 

ieA„b S 

where p{I = l\d) = 1 - p(I = 0\9). 

Under the prior p(logiV) oc 1, the conditional posterior of N\9, x obs is a negative binomial distribution with param- 
eters n and p(I = 1\9). The negative binomial distribution gives the probability that the total number of sources is 
equal to N, given that there have been n detections with probability of detection p(I — 1\9): 

p(N\n, 9) = C^ 1 W = l\9)T W = mf' n ■ (16) 

Because of the large number of parameters in the model, Bayesian inference is most easily performed by randomly 
drawing values of N and 9 from their posterior. Based on the decomposition p(9, N\x obs ) oc p(N\n, 9)p(9\x obs ), we 
can obtain random draws of (N,9) by first drawing values of 9 from Equation {T5]). Then, for each draw of 9, we 
draw a value of N from the negative binomial distribution. Random draws for 9 may be obtained via markov chain 
monte carlo (MCMC) methods, describ ed in jj [5l and rando m draws from the negative binomial distribution are easily 
obtained using standard methods (e.g.. IGelman et alj|200l KFV08). 
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4. THE STATISTICAL MODEL 

In order to compute the likelihood function for the observed set of luminosities, redshifts, and broad emission line 
widths (see Eq.[9]), it is necessary to relate the BHMF to the distribution of L\ and v. To do this, Equation {8J implies 
that we need three terms. The first term is an assumed BHMF, p(Mbh, z ) = N^ 1 (dV/ dz)^ 1 (/)(Mbh , z )- The second 
term is an assumed distribution of luminosities at a given black hole mass and redshift, p(L\\Mbh, z). The third term 
is an assumed distribution of broad emission line widths at a given luminosity and black hole mass, p{v\L\, Mbh)- 
Once we have a parameteric form for each of these three distributions, we can calculate the observed data likelihood 
directly from Equation ((9|). In this section we describe parameteric forms for each of these distributions based on a 
mixture of Gaussian functions model. 



4.1. Mixture of Gaussian Functions Model for the BHMF 

The mixture of Gaussian functions model is a common 'non-parameteric' model that allows flexibility when estimat- 
ing the BHMF. The basic idea is that one can use a suitably large enough number of Gaussian functions to accurately 
approximate the true BHMF, even though the individual Gaussian functions have no physical meaning. Furthermore, 
the Gaussian mixture model is also conjugate to the distributions p(L\\m) and p(y\L\, m) assumed in ^4.21 and 14.31 
thus enabling us to calculate some of the integrals in Equation (fT"5|) analytically. 

In KFV08 we described a mixture of Gaussian functions model for a luminosity function. The mixture of Gaussian 
functions model of the BHMF is identical to that for the luminosity function, after replacing L with Mbh- Our 
mixture of Guassian functions model, including our adopted prior, is described in KFV08; for completeness we briefly 
review it here. 

The mixture of K Gaussian functions model for the BHMF is 



K 



p(logM/j//,logz|7r,//,£) = ^2 



7i> 



■ exp 



-^(y- w 



" s fc X (y-Mfc) 



(17) 



where 53fe=i n k = 1- Here, y = (log Mbh, log z), fJ,k is the 2-element mean vector for the fc th Gaussian functions, 
S/c is the 2x2 covariance matrix for the fc th Gaussian function, and x T denotes the transpose of x. In addition, we 
denote n — (tti, . . . , ttr-), [i = (jii, . . . , Hk), and £ = (Si, . . . , £«-). The variance in log Mbh for Gaussian function k 
is cr^ k — Sn.fc, the variance in logz for Gaussian function k is a z k — £22, fc, and the covariance between log Mbh 
and logz for Gaussian function k is o~ mz .k = £12, fc- Note that Equation (|17jl is equivalent to assuming that p(Mbh, z) 
is a mixture of log-normal densities. Under the mixture model, the BHMF can be calculated from Equations |1| and 
(fl~7)) . Noting that p(M B H,z) = p(log M B h, \ogz)/(M B Hz(\n 10) 2 ), the mixture of normals model for the BHMF is 



4>{M BH ,z\0,N) 



N 



M B Hz(lnW) 2 \dz J 



dV\ 



-1 K 



2^|£ fc l 1 /2 



exp 



(18) 



where, as before, y = (log Mbh, log z). 



4.2. The Distribution of L\ at a Given Mbh 

We model the distribution of luminosities at a given Mbh as a log-normal distribution, where the average log L\ at 
a given Mbh depends linearly on log Mb//: 



p(logL x \M B H,a) 



1 



exp 



1 /log - n,, 
2 



log M B h 



(19) 



Here, the unknown parameters are a = («o, a m , af). This is equivalent to assuming a simple linear regression of 
log La on log Mbh, where ao is the constant, a m is the slope, and o~i is the standard deviation of the random Gaussian 
dispersion about the regression line. We assume a uniform prior on these parameters, i.e., p(ao, a m , a{) oc 1. 

The form of the Mbh~L\ relationship given by Equation (flU|) is motivated by noting that L\ can be related to 
M B h as 



\L X = 1.3 x lO 38 -^ 



C x M(7 



[erg s x ], 



(20) 



where TEdd = Lboi/LEdd is the Eddington ratio, and C\ is the bolometric correction to \L\. Equation (|20|) implies 
that the distribution of luminosities at a given black hole mass is caused by the distribution in Eddington ratios and 
bolometric corrections at a given black hole mass. The distribution of log La at a given Mbh is the convolution of the 
distribution of logT^dd at a given Mbh, with the distribution of logCA at a given Mbh- The parameter 01 is thus an 
estimate of the dispersion in log(r 'Edd/C\) at a given Mbh- 

If both TEdd and C\ are statistically independent of Mbh, then we would expect that on average La oc Mbh, i-e., 
a rn — 1. However, if T^dd or Ca are correlated with Mbh, then a m 7^ 1. Currently, it is unknown whether Mbh 
and TEdd are correlated. However, it is likely that quasar SEDs depend on both T B dd and Mbh, and therefore the 
bolometric correction will also depend on T Edd an d Mbh- Indeed, recently so me authors have foun d evidence that th e 
bolometric correction depends on Eddington ratio (|Vasudevan fc Fabianll2007T ) and black hole mass (|Kellv et al.l f2008). 
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Therefore, it is likely that a rn 7^ 1, and we leave it as a free parameter. In addition, comparison of Equation (|19p 
with Equation (|20|) . and assuming that TEdd/C\ is independent of Mbh, implies that the average value of TEdd/C\ 
is related to «o according to E(\ogTEdd/C\) = a n — 38.11, where E{x) denotes the expectation value of a;. Therefore, 
one can use ao to estimate the typical broad line quasar Eddington ratio, assuming a typical bolometric correction. 

Currently, there is little known about the distribution of luminosities at a given black hole mass, so for simplicity 
we assume the simple linear form given by Equation (|19|) . Furthermore, the assumption of a Gaussian distribution 
in log L at a given Mbh is consistent with the L-Mbh relationship for those AGN with reverberation mapping 
data (Kelly & Bcchtold 2007). More sophisticated models could include a non- linear dependence on log Mbh, an 
additional rcdshift dependence, or non-Gaussian distribution. Unfortunately, this introduces additional complexity 
into the model. Furthermore, an additional redshift dependence in Equation (|19[) implies that the distribution of TEdd 
or C\ at a given Mbh evolves. However, currently most investigat i ons have not found any evidence for sig nificant 
evolution in Tsdd fe-g- ICorbett et al.ll2003tlMcLure fc Dunlopll2004l IVestergaar'd 2004; K olimeier et~aT1 12009 ). and it 
is unclear if the quasar SED evolves at a given Mbh- Therefore, there is currently no compelling evidence to justify 
inclusion of a redshift dependence in Equation (JT5J) . In addition, we note that it is impossible to use p(L\Mbh) to 
infer the distribution of Eddington ratios without making an assumption about the distribution of Choi , as Equation 
(|20p shows that T^dd and C(, i are degenerate. While estimating the distribution of T^dd is of significant interest, it 
is beyond the scope of this work to develop a robust technique to do so, as our goal is to estimate the black hole mass 
function. 

Because of the large number of parameters, large uncertainty in the broad line black hole mass estimates, and flux 
limit, estimating the BHMF is already a difficult statistical problem. As such, our approach is to initially assume the 
simple form given by Equation (|19p in order to keep the degrees of freedom low, and to check if this assumption is 
consistent with our data (see ? 16. 3p . If it is found that the observed data are inconsistent with this statistical model 
(e.g., see § 16. 3|) then Equation (fl9|) should be modified. 

4.3. The Distribution of v at a given L and Mbh 

Following iKellv fc Bechtolcfl (pOOl . we can derive the distribution of emission line widths at a given luminosity and 
black hole mass. Given an AGN luminosity, Lf L , the BLR distance R is assumed to be set by the luminosity according 
to the R-L relationship, R cx , with some additional log-normal statistical scatter: 



p(logi?|Lf L ) = -===exp 

V 27R7 r 



1 { logR-rp-PiiogL f 1 - 



(21) 



Here, r is a constant, ay is the dispersion in logi? at a given luminosity, and Lf L is the AGN continuum luminosity 
at some reference wavelength appropriate for the broad emission line of interest. Note that the reference wavelength 
for Lf L is not necessarily the same wavelength as for L\ used in § 14.21 In particular, the wavelength for L\ used in the 
Mbh~L\ relationship should be chosen to adequately account for the selection function, while the reference wavelength 
for L^ L should be appropriate for describing the R-L relationship. Since AGN continua are well described by a power- 
law, /„ oc v~ a , it should be easy to calculate L\ at different values so long as the spectral index, a, is known. The 
intrinsic scatter in R at a given L^ L is likely due to variations in quasar SED, reddening, non-instantaneous response 
of the BLR to continuum variations, etc. 

Assuming that the BLR gas is gravitationally bound, the velocity dispersion of the broad line-emitting gas is related 
to R and Mbh as Mbh = fRv 2 /G. Here, G is the gravitational constant, and / is a factor that converts the virial 
product, RMbh/G, to a mass. We do not directly measure v, but instead estimate it by the FWHM or dispersion 
of the broad emission line in a single-epoch spectra. As a result, the measured line width will scatter about the actual 
value of v, where this scatter may be due in part to variations in line profile shape and the existence of stationary 
components in the single-epoch line profile. In our statistical model we assume that this scatter is log-n ormal with a 
dispersion of ay. In addition, the value of / depends on the measure of line width used. lOnken et al.l (|2004l ) estimated / 
by comparing black hole masses derived form reverberation mapping with those derived from the Mbh~o~ relationship, 
and find that on average / = 1.4 ± 0.4 when using the FW HM. This value is consistent with a value of / = 0.75 
expected from a spherical BLR geometry (e.g., [Nctzer 1993). 

Under our model, the distribution of emission line widths at a given BLR size and black hole mass is 



logw|i?, M BH = j== cxp \ -- 

V 27r(J i { 2 



log v-Vq- l/2(log / + log R - log M B h) 



(22) 



where vq is a constant. For convenience, here and throughout this paper we denote the estimate of the BLR gas 
velocity dispersion as v, i.e., v is either the FWHM or dispersion of the emission line. The term v in Equation (|22[) 
should not be confused with the actual velocity dispersion of the BLR gas, but is an estimate of it based on a measure 
of the width of the broad emission line. From Equation (|22|) it is apparent that the term / shifts the distribution of 
\ogv by a constant amount, which has the effect of shifting the inferred BHM F by a constan t amount in log Mbh- 
Throughout the rest of this work we assume the value of / = 1.4 found bv lOnken etaTI (|2004h . 

The distribution of v at a given L and Mbh is obtained from Equations (f2Tj) and (|22|) by averaging the distribution 
of v at a given R and Mbh over the distribution of R at a given L^ L • 

/oc 
p(logu|log R, M BH , f3) P (logR\L* L ,f3) dlogR (23) 
-OO 
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exp 



1 



V^Tl 2 



logv - /3o - 1/209, log Lf L - \ogM BH ) 



&BL 



(24) 



where 0o is a constant, <j 2 BL = c 2 + of/4, and (3 = (0q, 0i, (Jbl)- Note that in Equation (|2"4|) we have absorbed 
log / into the constant term, /3 - The term ct^l is the dispersion in emission line widths at a given luminosity and 
black hole mass, and can be related to the intrinsic uncertainty in the broad line estimates of Mbh- The usual 
broad line mass estimates of AGN can be obtained by reexpressing the mean of Equation ([24]) in terms of Mbh'- 



logMeL = f3i\ogLf L + 2 logv — 2/?o, or equivalently Mbl oc L^ bl v 2 . The intrinsic uncertainty on the broad line 
mass estimates is set by a combination of the intrinsic dispersion in R and at a given L, and the uncertainty in 
using the single-epoch line width as an estimate of the broad line gas velocity dispersion: &m bl = 2o\bl- Equation 
(|2~4"|) describes the stat istical uncertainty in the broad line mass estimates, and does not account for any additional 
systematic errors (e.g.. iKrolik 2001: [Collin et al"1l2006D . 

It is typically the case that one employs multiple emission lines to estimate Mbh, producing black hole mass 
estimates across a broad range of redshifts and luminosities. In our work, we use the H/3, Mg II, and C IV emission 
lines. In order to facilitate the use of different emission lines in the BHMF estimation, we introduce an indicator 
variable denoted by 5. Here, Sup = 1 if the H/3 line width is available, and Sb_/3 = if the H/3 line widths is not 
available; SmqII and Sciv arc defined in an equivalent manner. For example, if one is using optical spectra, then at 
z = 0.4 only the H/3 emission line is available, and therefore (5h/3 = 1, SmqII = 0, and 5civ — 0. 

Assuming that the line width distributions for each line are independent at a given luminosity and black hole mass, 
then the observed distribution of line widths is the product of Equation ([24jl for each individual emission line: 

p(logv\L,M B H,z,0) = 

[N(\ogv H p\v HI3 ,<j 2 ll3 )] SH ' 3 [A^(logWMgii|vMgii,cr^ gII )]' 5MgI1 [A r (logu C iv|wciv,crciv)]' 5ciV (25) 

Here, the average line width for H/3 is v w = 0^ - (l/2)/3, H/3 \ogL^ p + (1/2) log Mbh, and likewise for Mg II and 
C IV. Here, L^f denotes the value of L\ that is used to calibrate the broad line mass estimates for H/3, typically 
L a(5100A). 

IVestergaard fc Peterson! (|2006[ ) give equations for calculating broad line mass estimates fro m H/3 and C IV, derived 
from the most recent reverberation mapping data ([Peterson et alJ [20041 : iKaspi et al.ll2005l ). and IVestergaard et al.l 
(2008, in progress) give an equation for calculating a broad line mass estimate from Mg II. These mass scaling 
relationships are: 

log M H/3 = -21.09 + 0.50 log XL X (5100l) + 2 log FWHM W (26) 
log M Mg n = -21.21 + 0.50 log XL X (21001) + 2 log FWHM Mg u (27) 
log M C w = -22.66 + 0.53 log \L X (1350A) + 2 log FWHM CW (28) 
For the equations listed above we have used the FWHM of the emission line as an estimate of the velocity dispersion, 
i.e., v = FWHM. Because log Mbl = 0i log XLf L + 21ogw - 2/3 , it follows that 0^ = 10 - 55, 0^ gU = 10.61, 0g IV = 
11.33, and 0i ~ 0.5 for all three emission lines. In addition, Vcstcrgaar d fc Peterson! (|2006T i find the statistical 
uncertainty in the broad line mass estimates to be 0.43 dex and 0.36 dex for H/3 and C IV, respectively . Therefore, 
since obl= a Ai BL /2, if follows that cth/s ~ 0.22 and ctciv ~ 0.18 dex. Likewise. IVestergaard et all (|2008l in progress) 
find the intrinsic uncertainty in the broad line mass estimate for Mg II to be ~ 0.4 dex, and therefore CMgii ~ 0.2 
dex. Ho wever, this statistical uncertainty may be smaller if a correction is made in the virial relationship for radiation 
pressure ([Marconi et al.ll2008h . 

Broad li ne mass estimates are now fair ly we ll understood, and we der ive our prior distribution for from the scaling 
results of IVestergaard fc Peterso n (2006) and IVestergaard et alJ (|2008l in progress). We fix 0i = 0.5,0.5, and 0.53 for 
H/3, Mg II, and C IV, respectively. However, in order to account for the uncertainty in these scaling relationships, 
we consider @q and <tbl to be free parameters in our model. We cannot estimate the normalization and statistical 
uncertainty in the broad line mass estimates solely from the distribution of v, L, and z, since /3o and obl are degenerate 
with the other parameters. Therefore, it is necessary to place constraints on /3o and <jbl through a prior distribution. 
This allows us to constrain /3o and obl while still incorporating their uncertainty. The parameters for th e prior 
distri bution of /3p and obl a re based on the uncertainty in the scaling relationships of IVestergaard fc Peterson! ([2006! ) 
and IVestergaard et alJ (2008, in progress). Our prior for O are independent Gaussian distributions with means equal 
to 10.55, 10.61, and 11.33 for H/3, Mg II, and C IV, respectively, and standard deviations equal to 0.1. To allow 
greater flexibility in our model, we chose the prior standard deviation o f 0.1 to be wider than the formal uncertainty 
on the scaling factors of w 0.02 reported by IVestergaard fc Peterson] (2006). For each emission line, our prior for 
<jbl is a scaled inverse-x 2 distribution with v = 25 degrees of freedom and scale parameter equal to 0.2 dex. We 
chose v = 25 degrees of freed om because approximately 25 AGN were used to derive the scaling relationships in 
IVestergaard Sz Peterson! (120061) . T he v alues of 0q were constrain ed to be within ±0.3 (i.e., ±3<r) of the values reported 
bv IVestergaard fc Peterson! (|2006h and IVestergaa rd" et al.l (|2008l in progress), and the values of cjbl were constrained 
to be within the inverval containing 99% of the probability for the scaled inverse-x 2 distribution. By placing these 
constraints on 0q and <jbl, we ensure that their values remain consistent with the results derived from reverberation 
mapping. 
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4.4. Likelihood function for Mixture of Gaussian Functions Model 

Now that we have formulated the conditional distributions, we can calculate the likelihood function for the mix- 
ture of Gaussian functions model of <fi(M BB , z). Comparison with Equation (|15p suggests that we need two terms: 
p(vj, L\ j, Zj\9) and p(I — 1\9). The first term is the joint distribution of line widths, luminosities, and redshifts: 

P(vi, L Xii , Zi\6) = J p{v i \L Xt i,M BHA ,P)p(L X4 \M BH ^,a)p(M BH ^,z i \Tr,fi, E) dM BH , n (29) 

where = (a, /3, n, fj,, E). 

The integral in Equation (|2^|) can be done analytically by inserting Equations (TIT)) , (JT3J) , and (|2"5)l into Equation 
(J2U). However, the result depends on the number of emission lines used for the i th source. Expressing the likelihood 
function for a single emission line in terms of logarithms, p(logVi, log L x ^, log Zi\9) is a mixture of K 3-dimensional 
Gaussian functions: 

K 7T f 1 

Hlog^,log£A,i,logZi|fl)=y] / ^ T ^ exp^--(x,-6) T ^r 1 (x i (30) 

(31) 
(32) 
(33) 

(34) 
(35) 

(Var (log v\k) Cov (log t) , log l\k) Cov (log v , log z\k) y 
Cov (log v, log l\k) Var(logl\k) a m a mz ,k ] (36) 

Cov(logv, log z\k) a m o- m z,k cr 2 zk 

Var(logv\k)=a 2 BL + \[[5 2 V ar (log l\k) + (1 - a m )< fc ] (37) 

Var(logl\k) = a? + a 2 m o- 2 m . k (38) 

Cov(logv,logl\k) = ~a m o-l ltk - ^frVar (log l\k) (39) 

Cou(log v, log z|fc) = Q - i/3;a m ^ cr mz , fc . (40) 

Here, and Vfc are the mean vector and covariance matrix of (logUj, log La, i, logz^) for the k th Gaussian function, 
respectively. In addition, l k is the mean log La for Gaussian function k, Vk is the mean v for Gaussian function k, 1 B l is 
the mean logL^ for Gaussian function k, Var(logv\k) is the variance in logw for Gaussian function k, Var(logL x \k) 
is the variance in logLA for Gaussian function k, Cov (log v, log L x \k) is the covariance between logf and logL x for 
Gaussian function k, and Cov(log v, log z\k) is the covariance between log v and log z for Gaussian function k; note that 
ot m o~mz,k is the covariance between log La and z for Gaussian function k. The mean \ogL x L for Gaussian function k is 
calculated from l k assuming a power-law continuum of the form Lf L — La(Abl/Aml) Qa , where X B l is the wavelength 
used in the R-Lf L relationship for the emission line of interest, and Xml is the wavelength that the M BB -L X is 
formu lated in. For example, X B l = 5100A for the H/3-based mass scaling relationship of IVestergaard fc Petersonl 
(2006), and Xml may be, say, equal to 2500A. Note that we are assuming that a x is known. 

In Equation (|30[) it should be understood that Vi,f3o,pi, and o BL correspond to the particular emission line being 
used. For example, if one is using the C IV line width for the i th source, then Vi — vciv,i, Po — Pq IV > Pi = Pf IV > an d 

2 2 

a BL — a CIV- 
ll there are two emission line widths available for the i th AGN, then p(vi, L Xy i, zi\9) is a mixture of K 4-dimensional 
Gaussian functions: 

p(log Vi , log L Aii , Zi\ 9) = ygpjfrj ex P { - \ (** - V k -' ( Xi - &) } (41 ) 

Xj = (log u lti , log v 2 .i , log L x .i, log Zi) (42) 
ffc = (wi,fc,«2,fc,ik,M2,fc) ( 43 ) 
Co«(log«i,Iog«a|fc) = i (/3 l , 1 f3 l , 2 Var(logL x \k) + al l ) (44) 

Here, Cov (log ui, logi^lfc) denotes the covariance between the logarithms of the two line widths, v% and for the fc th 
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Gaussian function. The 4x4 covariance matrix of (logVi,logLA.i,log.2i) is 

(Var (log vi\k) Cov (log vi, log V2\k) Cov(logvi,\ogL\\k) CW(log«i,log,z|/c) 

Gov (log v u log v 2 1 k) Var{\ogv 2 \k) Ccw(logt)2,logLA|fc) Cov (log v 2 , log z\ k) . 

Cov (log vi , log L\ | k) Cov (log v 2 , log L x \ k) Var(logL x \k) a m a mz , k \ ' ' 

Cov (log vi, log z | k) Ccw(logu 2 ,logz|£;) Oi m a mz . k <j 2 z k 

The other terms are given by Equations (j3~4")) (jit))) , where it should be understood that /?o, A, and cr^ L correspond to 
the values appropriate for each emission line. For example, at z ~ 0.6 both H/3 and Mg II are observable in the optical 
spectral region, and thus it is possible to have line widths for both emission lines. In this case, is the logarithm of 
the H/3 width for the i th source, v 2 ,i is the logarithm of the Mg II width for the i th source, (3i : \ corresponds to /3; for 
the H/3 line, and f3i t2 corresponds to fti for the Mg II line. The labeling of the H/3 line width as v\ is irrelevant, and 
the same result would be obtained if we had labeled the H/3 line width as v 2 . 

It should be noted that in Equation (|41j) we have made the assumption that if at least one emission line has 
Vmin < v < v max , then v is estimated for all emission lines in the observable spectral range at that redshift. If this is 
not the case, then Equation (|4Tj) must be integrated over vu or v 2 ,i if either of v\^ or v 2i i fall outside of (v m i n , v max ). 

The term p{I = 1\9) is the probability that a source is included in one's sample for a given set of model parameters 
9. Under the mixture of Gaussian functions model, Equation (|10|) can be simplified, allowing more efficient calculation. 
However, as above, the actual functional form of p(I = 0\9) depends on the number of emission lines used in broad 
line mass estimation. If only one emission line is used, then Equation (|10[) becomes 



p(I=l\8)=n /"""" S ^ X ' Z J y^iT k f v (Lx,z,8,k)N 2 (y lz \y lz . k ,V lz . k ) dz dL x (46) 
J-ooJz min zlnl ° 

y lz = {\ogL x ,\ogz) (47) 

yiz,k = (h,Vz,k) (48) 
_ /Var (log L x | k)a m a mz , k \ 

lz,k I n n n 1 I V / 



0. 1n (7 7rlz k 0~ z 

The term f v (L x , z,6,k) is the probability that a source has at least one line width between v m in and v ma x for 
the fc th Gaussian function, given its luminosity and redshift. For redshifts where only one emission line is used, 
f v (L x ,z,0,k) = Pr(v min <v< Vmax^x.z.O.k), where 

Pr(v mm < v < v max \L x , z,9,k) = <p( l°g W-fi(tog«|i^ A _ 3 ( log ^n^££og^A ^ \ (5Q) 

V yrVar(logv\L x ,z,k) J \ y/VarQagv\Lx, z, k) J 

E(logv\l,z,k)=v k + clV l ~j.(yi z -yiz.k) (51) 

Var(\og v\L x , z, k) = Var(\og v\k) - cjfy-£c£ (52) 

c fe = [Cow(logw,logL A |fc),Cow(logt;,logz|fc)] . (53) 

Here, $(•) is the cumulative distribution function of the standard normal distribution, E(\ogv\L x , z, k) is the mean of 
log?; for the fc th Gaussian function at a given L x and z, Var(\ogv\L x , z, k) is the variance in logf for the k th Gaussian 
function at a given L x and z, and c k is a 2-dimensional vector containing the covariances between logu and both 
log L,\ and logz. The standard normal cumulative distribution function can be efficiently computed using a look-up 
table, and therefore only two integrals need to be calculated numerically in Equation (f4"6"|) . 

If one is using multiple emission lines for estimating 4>{Mbhi z )i then f v (L x , z, 9, k) must be modified to account for 
this. Equation (|50|) gives the probability that an emission line has a line width v m i n < v < v max , under the assumption 
that only one emission line is used at any given redshift. However, if there are redshifts where two emission lines are 
used, then f v (L Xl z, 9, k) must be modified, as in these cases we need the probability that at least one emission line has 
Vmin < v < v max - At redshifts where two emission lines are used, f v (L x , z,6,k) becomes the probability that either 

Vmin ^ V 1 <C V max OI Vmin ^ V 2 < V m ax- 

f v (L X , Z, 9, k) = Pr{Vmm < Vi < V 7n ax\L X , z, 9, k) + Pr(v m in < v 2 < V max \L X ,Z, 9, k) (54) 

<v x < v m ax\L x , z,9,k)Pr(v mm < v 2 < v max \L x , z, 8, k), (55) 

where Pr{v m in < Vj < v ma x\L x , z, 9, k) are given by Equation (|50|) for j = 1, 2, respectively. 

As an example, at z ~ 0.2 only the H/3 line is available in the optical spectral region, and thus, at this redshift, 
an optical survey can only employ the H/3 line for estimating the BHMF. In this case, p(\ogVi, log La. i, log Zi\9) is 
given by Equation ([30]) . and f v (L x ,z,6,k) is given by Equation ([50]) . However, at z ~ 0.6, both H/3 and Mg II are 
observable in the optical spectral region, and thus both may be employed for estimating the BHMF. At this redshift, 
p(logVj,logLA,i,logz,|#) is given by Equation (|4"Tj) . and f v (L x , z,9,k) is given by Equation (|55|) . where v = (vi,v 2 ), 
v\ is the H/3 line width, and v 2 is the Mg II line width (or vice versa). If only one emission line is available at any 
particular redshift, either because of limited spectral range or because of a choice on the part of the researcher to 
ignore certain emission lines, then only Equations (|30[1 and (150(1 need be used. 
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The functional forms of p(v,, L\^, Zi\9) and p(I = 1\9) given above can be inserted into Equation |5} to obtain the 
likelihood function for the mixture of normals model. A maximum- likelihood estimate of c/)(Mbh, z) can be obtained by 
first maximizing Equation ([5]) with respect to N and 9 = (<Xq, ot m , of , (3q, fy, o BL , n, fi, E). Then, using the maximum- 
likelihood estimate of (TV, n, \x, E), the maximum-likelihood estimate of c/)(Mbh, z) is calculated by using Equation \17\ 
in Equation |T]). Unfortunately, for K > 1 Gaussian functions, maximizing the likelihood for the Gaussian mixture 
model is a notoriously difficult optimi zation problem. The maximizatio n is probably most efficiently performed using 
the Expectation-Maximization (EM, Demps ter. Laird. &: Rubinl Il977f) algorithm, or employing a stochastic search 
routine. Since we focus on Bayesian inference, a derivation of the EM algorithm for the BHMF is beyond the scope of 
this work. 

The posterior distribution of 9 and N can be calculated using the forms given above for p(logVi, logL^.i, Zi\9) and 
p(I = 1\9). In this case, one inserts the equations for p(\og v^, log La, i, log Zi\9) and p(I = 1\9) for the Gaussian mixture 
model into Equations (|15|) and (fl"6|) . The prior distribution, p(9), is given by Equation (21) in KFV08. 

4.5. Accounting for Measurement Error 

The preceding discussion has assumed that and L x ^ are known. However, in general, both quantities are measured 
with error. The effect of measurement error is to artificially broaden the observed distributions of Vj and L x ^. Because 
the Bayesian approach attempts to define the set of BHMFs that are consistent with the observed distribution of 
Vi,L\^, and Zi, where 'consistency' is measured by the posterior probability distribution, measurement error can 
affect statistical inference on the BHMF. If the variance of the measurement errors on and L x ^ are small compared 
to the intrinsic physical variance in these quantities, then measurement error does not have a significant effect on 
the results. In general, the measurement errors on L X:l will likely be small compared to the physical range in AGN 
luminosities, so we neglect them. This may not always be the case for the emission line widths, and in this section 
we modify the likelihood function for the mixture of Gaussian functions model to include measurement errors in Vj. 
The general method of handling measurement errors within a Bayesian or likelihood function approach is described in 
many references (e.g.. iKellvf 2007) . For the sake of brevity, we omit the derivations and simply report the modifications 
to the likelihood function. 

If one is only employing one emission line at a given redshift, then Equation (|29|) can be factored as 

p(log Vi , log L x ,i , log Zi\6) = p(\og Vi \L x ,i,Zi, 9)p(\og L x ,i , log z { \ 9) 

K 

= y"V fc p(log«j]LA,i, Zi, 9, k)p(\ogL x ,i, \og Zi\9, k) (56) 
fc=i 

Under the mixture of Gaussian functions model, the joint distribution of luminosity and redshift for the k th Gaussian 
function is obtained from Equations (|30|) — (|36[) by simply omitting the terms that depend on vf. 

p(logL x ,i,logZi\d,k) = I cy^\-\{Yiz;i-yiz,k) T V{;\{yi z ^-yi Ztk )\ . (57) 
V 47r \Vlz,k\ I 1 J 

Here, yi z ^ = (logLA^log^), yi z ,k is given by Equation (|48f and Vi Zy k is given by Equation (|49|) . The distribution of 
the measured log Vi at L x ^ and Zi for the k th Gaussian function is 

n \t a u\ 1 / 1 (\ogVi- E(logv\L x .i,Zi,k)) 2 \ 

p(logVi\L Xji ,Zi,e,k) = expt -- — — — } . (58) 

^27r[Var(logv\L x ^ Zl ,k)+al t } { 2 Var (log v\L Xji , z u k) + J 

Here, a 2 i is the variance of the measurement error on E(logv\L X ij Zi, k) is given by Equation (|5ip . and 
Var(\ogv\L x ^, Zi, k) is given by Equation ([52")) . From Equation (|58[) the effect of measurement error on the line 
width becomes apparent: the distribution of line widths at a given luminosity and redshift is broadened by an amount 
dependent on the magnitude of the line width measurement error. If a% i <C Var(\ogv\L X i, z%, k) then Equation (I56|) 
reduces to Equation (|29p . Otherwise, if measurement error on is a concern, Equations (|56p - (|58p should be used for 
Equation (|29[) instead of Equation ([50]) . 
If one is employing two emission lines at a given redshift, then Equation (|29p becomes 

Mlogv.TogiA^log^lfl) = p(\ogv ljl \L x ^, z l ,9)p(\ogv 2 ,i\L Ki , z l ,9)p(\ogL x ^ l ,\ogz l \9). (59) 

In this case, p(\ogVj t i\L Xj i, Zi, 9),j — 1, 2, must be calculated seperately for each emission line from Equation (|58p . 

5. POSTERIOR DISTRIBUTION OF THE BHMF VIA MARKOV CHAIN MONTE CARLO 

The number of free parameters in our statistical model is 6K + 8, where K is the number of Gaussian functions 
used to approximate </>(logAfB#,logz). Because of the large number of free parameters, summarizing the posterior 
is most efficiently done by using Markov Chain Monte Carlo techniques to simulate rando m draws of 9 and N from 
the posterior distribution. In this wo rk we use the Metropolis-Hastings algorithm (MHA. iMetropolis fc Ulaml 119491 : 
iMetropolis et al.lll953l : lHastingslll970h to perform the MCMC. We use the MHA to obtain a set of random draws from 
the marginal posterior distribution of 9, given by Equation (| 15|) . Then, given the values of 9, random draws for 
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may be obtained from the negative binomial distribution. A further description of the Metropolis-Hastings algorithm 
is given by KFV08, and our M HA is an extens i on of the MHA described in KFV08. For further details on the MHA 
see lChib fc Greenberd (fl995h or lGelman et~aTl (l200l . 

As in KFV08, we denote the current value of a parameter by placing a ""over its symbol, and we denote the proposal 
value by placing a * in the superscript. For example, if one were updating ao, then ao denotes the current value of 
ao in the random walk, denotes the proposed value of ao, 9 denotes the current value of 9, and 9* denotes the 
proposed value of 9, i.e., 9* — (a^ a m , of , /3 , /i, £, /to, A, T). Here, ^o,A and T are the parameters for the 

prior distribution on the mixture of Gaussian functions parameter (see KFV08). In addition, for ease of notation we 
define x a b s = (v (, s , L a b s , z Q b s ) to be the set of observable quantities. 

Our adopted Metropolis-Hastings algorithm is as follows: 

1. Start with initial guesses for ao, a m , of, /?o, <r| £ , 7r, /i, S, /io, and A. 

2. Draw a proposal value for ao and a m from a 2-dimensional normal distribution centered at the current values of 
ao and a m with set covariance matrix, E Q . The proposal values of a and a m are then simulated as (a^, a* J ~ 
_/V 2 ([ao, a m ], E a ). If p(9*\x b s ) > p(9\x b s ) then set ao = and a m — a* n . Otherwise, calculate the ratio 
r a = p(9*\x bs) I 'p(9\x b s ) and draw a random number uniformly distributed between and 1, denoted as u. If 
u < r a then set ao = a^ and a TO = a,^. Otherwise, if u > r a , the values of ao and a TO remain unchanged. 

3. Draw a proposal value for log of as log of ~ Af(21ogof, of ( ), where of ( is some set variance. Similar to before, 
calculate the ratio r a = 0[p(9*\x b s ) /&ip{9\x b s )- Here, the term o\ /07 arises because the MHA acceptance rule 
must be corrected for the asymmetry in the log-normal jumping distribution used for of. If r CT > 1 then set 
o / = o" ; *, otherwise set 01 = o\ with probability r a . This is done by drawing a uniformly distributed random 
variable as in step[5J 

4. Draw a proposal value for /3o from a normal distribution centered at the current value of (3q with set variance, 
Op. If p(9*\x b s ) > p(9\x i, s ) then set fio = (3q. Otherwise, calculate the ratio rp = p(9*\x b s ) /p(9\x b s ) and draw 
a random number uniformly distributed between and 1, denoted as u. If u < rp then set /?o = Pq- Otherwise, 
if u > rp, then the value of (3q remain unchanged. If one is employing multiple emission lines to estimate the 
BHMF, then we have found it faster to simulate proposed values of /?o for each emission line simultaneously from 
a multivariate normal distribution. 

5. Draw a proposal value for logo-^ L as logo-^ L ~ N(2 logo^ L , of Bi ), where of Bi is some set variance. Similar 

to the update for of, calculate the ratio tbl = v*BLP(®*\ x obs) I & BLP(9\x b s ) ■ If tbl > 1 then set obl = &bL' 
otherwise set obl = ®*bl with probability r a . This is done by drawing a uniformly distributed random variable 
as in step [2j If one is employing multiple emission lines to estimate the BHMF, then we have found it faster to 
simulate proposed values of logo^ £ for each emission line simultaneously from a multivariate normal distribution. 

6. Draw new values of the Gaussian mixture model parameters according to steps 2-6 in the MHA described in 
KFV08. 

One then repeats steps [5HB] until the MCMC converges, saving the values of 9 at each iteration. After convergence, the 
MCMC is stopped, and the values of 9 may be treated as a random draw from the margi nal posterior distribut ion of 9, 
p(9\x bs)- Techniques for monitering convergence of the Markov Chains can be found in lGelman et al.l (|2004h . Given 
the values of 9 obtained from the MCMC, one can then draw values of N from the negative binomial distribution (cf. 
Eq.pSj). 

Having obtained random draws of N and 9 from p(9, N\~v b s , L Q b s , z b s ), one can then use these values to calculate 
an estimate of <P(Mbh, z), and its corresponding uncertainty. This is done by using each of the MCMC draws of 9 and 
N to calculate Equation (fl8|) . The posterior distribution of <P(Mbh, z) can be estimated for any value of Mbh and z 
by plotting a histogram of the values of (/)(Mbh, z) obtained from the MCMC values of 9 and N. KFV08 illustrates 
in more detail how to use the MHA results to perform statistical inference. 

6. APPLICATION TO SIMULATED DATA 

As an illustration of the effectiveness of our method, we applied it to a simulated data set. B ecause we will eventually 
apply this method to the BHMF for the SDSS DR3 quasar catalogue (Schneider et al. 20 051) , we assume the effective 
survey area and selection function reported for the DR3 quasar sample (jRichards et al.l l2006). 

6.1. Construction of the Simulated Sample 

We construct our simulated survey in a manner very similar to that used by KFV08. We first drew a random value of 
Nq quasars from a binomial distribution with probability of success tt/4ir — 0.0393 and number of trials N = 2 x 10 5 . 
Here, = 1622 deg is the effective sky area for our simulated survey, and we chose the total number of quasars to 
be N = 2 x 10 5 in order to produce a value of n ~ 1000 observed sources after including the flux limit. While this 
produces a much smaller sample than the actual sample of ~ 1.5 x 10 4 quasars from the SDSS DR3 luminosity function 
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Fig. 2. — The true BHMF (solid red line) at several values of z, and the best K = 4 Gaussian function fit (dashed black line). In this 
case, approximating the BHMF with K = 4 2-dimensional Gaussian functions provides a good fit. The mixture of Gaussian functions 
approximation diverges from the true BHMF in the tails of the distribution of Mgjj. However, in general, the uncertainties on the BHMF 
in the tails are dominated by the statistical errors due to the small number of sources in these regions, and not by the bias introduced from 
approximating the BHMF as a mixture of Gaussian functions. 



work (Ric hards et~alll2006ft . we chose to work with this smaller sample to illustrate the effectiveness of our method on 
more moderate sample sizes. This first step of drawing from a binomial distribution simulates a subset of Nq sources 
randomly falling within an area f2 on the sky, where the total number of sources is N. Note that we have not included 
any flux limits yet. 

For each of these Nq ~ 8000 sources, we simulated values of Mbh and z. We first simulated values of log z from a 
distribution of the form 

n ^ 4r(a + 6) expK*) 

g logz = 60 

T(a)T{b) (1 + cxp(C*)) + 

where = 4(logz — 0.4). The parameters a = 2 and b = 3 were c hosen to give an observed redshift distribution 
similar to that seen for SDSS DR3 quasars (e.g. JRichards et al J 1200^ 1. 

For each simulated value of z, we simulated a value of Mbh by taking the distribution of Mbh at a given redshift 
to be a smoothly-connected double power- law. In this case, the conditional distribution of log Mbh at a given z is 



7(2)/ In 10 



1 



g{\ogM BH \z)^Ml^ 

7(z) = 2.5 + 0.51ogz 
<5(z) = 4.75 + 21ogz 
\ogM BH \z) = 7.5 + 3 log(l + 



/ M 



BH 



\M* BH (z) 



( 7 (*)+i(»))/lnlO- 



(61) 

(62) 
(63) 
(64) 



where log M BH (z) approximately marks the location of the peak in g (log M B h\z), j(z) is the slope oi log g (log Mbh\z) 



for Mbh % M BH (z), and S(z) is the slope of log (log Mbh \z) for Mbh ^ M BH (z). For our simulation, both the 

3 (log Mbh k).9(l°g z ), and therefore 



peak and logarithmic slopes of the BHMF evolve. 

The joint probability distribution of log Mbh and logz is ^(log Mbh> log z) = 
Equations (|60"|) and (|6"Tj) imply that the true BHMF for our simulated sample is 



4>o(M B h,z) oc 



N 



zMbh \dz J 



— g(logM BH \z)g(logz). 



(65) 

The constant of proportionality in Equation (1651) can be calculated by noting that 

J f 4>o(Mbh, z) <1Mbh dV = N. Figure shows 4>o(Mbh,z) at several redshifts. Also shown in Figure[2]is the best 
fit for a mixture of K = 4 Gaussian functions. Despite the fact that <fio(M B H, z) has a rather complicated parameteric 
form, a mixture of four Gaussian functions is sufficient to achieve a good approximation to 0o(Mbh, z). 
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Fig. 3. — Distribution of Eddington ratios, r Edd> f° r our simulated survey at z = 0.5, 2, and 4. The corresponding values of \L\ [2500A] 
are shown along the top of the plot for a black hole with Mgu = 10® Mq and a bolometric correction of C\ = 10. 

For each simulated black hole mass and redshift, we simulated a luminosity according to Equation (|2"0l . However, 
unlike the Gaussian distribution assumed in this work (see Eq.|19|). we assume an asymetric distribution of Eddington 
ratios that evolves as T B dd °c Vl + z. We do this in order to test the robustness of our simple assumption that the 
distribution of L\ at a given Mbh is independent of redshift and given by a normal distribution. In this simulated 
'universe', the distrib ution of T B dd does not evolve strongly, as is implied by observations (e.g., IVestergaardl l2004t 
iKollmeier et~aL|[2006l) . 

To simulate values of luminosity at a given black hole mass, we first simulated values of the Eddington ratio from a 
skew-normal distribution as 

\ogT Edd = 0.2e - 0.75|(5| - 0.3 + 0.5 log(l + z). (66) 

Here, e and S are both random deviates independently drawn from the standard normal distribution, i.e., e, S ~ -/V(0, 1). 
Figure [3] shows the distribution of TEdd at a few different redshifts. Values of \L\ were then calculated according 
to Equation (|20[) assuming a constant bolometric correction of C\ — 10 (e.g., iKaspi et al]|2000l ). For simplicity, we 
only use a constant bolometric correction for all simulated quasars. In our simulation we take A = 2500A; the choice 
of A is arbitrary and has no material effect on our results. The median Eddington ratio for our simulated sample is 
r_Edd ~ 0.25, and the dispersion in logTEdd is ~ 0.5 dex. Because the mean TEdd evolves in our simulation, and because 
the mean M BH evolves, T E dd and M BH are slightly correlated due to the shared correlation with z: T E dd °c 
Therefore, L\ oc M BB f. Comparison with Equation (fT9|) suggest that we would expect ao ~ 36, a m ~ 1.09, and 
at ~ 0.5 dex. 

For each simulated black hole mass and luminosity, we simulated broad emission line widths for H/3, Mg II, and C 
IV according to Equation (|24| . We simulated values of the H/3 line width for < z < 0.9, values of the Mg II line 
width for 0.4 < z < 2.2, and values of the C IV line width for 1.6 < z < 4.5. Note that for this simulation both H/3 
and Mg II are available at 0.4 < z < 0.9, and both Mg II and C IV are available at 1.6 < z < 2.2. Based on the most 

recent reverberation mapping data (|Kaspi et al.ll2005t iBentz et al.ll2006f ), we took R oc L\ /2 (/3 ; = 0.5) for all emission 
lines. In addition, we set /3n = 10.6, 10.6, and 10.7 for the H/3, Mg II, and C IV emission lines, respectively; these 
values were chosen to give emission line FWHM with typical values of several thousand km s _1 . The dispersion in 
the logarithm of the emission line width at a given luminosity and black hole mass was taken to be <7bl = 0.25, 0.225, 
and 0.2 for H/3, Mg II, and C IV, respectively. These values of <jbl were chosen to give broad line mass estim ate 
statistical uncertainties similar to that found from the reverberation mapping data (|Vestergaard & Petersonll2006l ). 

We randomly kept each source, where the probability of includin g a source given its lu minosity and redshift was 
taken to be the SDSS DR3 Quasar selection function, as reported bv lRichards et ail (|2006f) . In addition, we only kept 
sources with at least one emission line having a line width 1000 km s" 1 < v < 1.8 x 10 4 km s _1 . Sources with v < 1000 
were assumed to be indistinguishable from narrow-line AGN, and sources with v > 1.8 X 10 4 were assumed to be too 
difficult to distinguish from the underlying continuum and iron emission, and are thus too broad to be able to obtain 
a reliable estimate of the line width. After simulating the effects of the selection function, we were left with a sample 
of n ~ 1000 sources. Therefore, our simulated survey was only able to detect ~ 0.5% of the N = 2 x 10 5 total quasars 
in our simulated 'universe'. 

The distributions of Mbh,z,L\, and v are shown in Figure[4]for both the detected sources and the full sample. As 
can be seen, the majority of sources are missed by our simulated survey, and that the fairly 'hard' limit on luminosity 
corresponds to a much 'softer' limit on Mbh- In particular, almost all simulated quasars with Mbh ^ lO 8 Af are 
missed at z > 1, and all simulated quasars with Mbh ^ 10 7 M Q are missed at any redshift. 

To simulate the effects of using values of (3q and <jbl derived from a reverberation mapping sample, we simulated 
a sample of 25 low-z sources with known Mbh', these low-z sources were simulated in the same manner as described 
above. We then used these 25 'reverberation mapping' sources to fit /3n and (Jbl- The fitted values were then used for 
our prior distribution on /3q and <jbl as described in § 14.31 
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Fig. 4. — The distribution of Mbh , L\, and FWHM for our simulated sample. Red dots denote sources included in the sample, and 
black dots denote sources not included in the sample. In the plot of FWHM as a function of z, yellow dots denote sources with Hf3 
measurements, red dots denote sources with Mg II measurements, and green dots denots sources with C IV measurements. In the plot of 
L\ as a function of Mbh, the solid line shows the best linear regression of logL^ as a function of log Mbh, and the dashed line shows the 
Eddington limit for a bolometric correction of C\ = 10. 



6.2. Performing Statistical Inference on the BHMF with the MCMC Output 

We performed the MHA algorithm described in § [3] to obtain random draws from the posterior distribution for this 
sample, assuming the Gaussian mixture model described in § |4j We performed 10 iterations of burn-in, and then ran 
the markov chains for a n additional 3 x 10 4 . We ran five chains at the same time in order to monitor convergence (e.g., 
see lGelman et all [2004) and explore possible multimodality in the posterior. The chains had converged after 4 x 10 4 
total iterations, leaving us with ~ 1.5 x 10 5 random draws from the posterior distribution, p(9, iV|v Q b s , L Q & S , z b s ). 

In Figure [5] we show 0(log Mbh, z) at several different redshifts, on both a linear scale and a logarithmic scale. In 
general, we find it easier to work with 4>(\og Mbh, z) = In 10Mbh4>{Mbh , z), as 4>(Mbh, z) can span several orders of 
magnitude in Mbh- Figure [5] shows the true value of the BHMF, <f)Q(\og M b h , z) , the best-fit estimate of </>(log Mbh, z) 
based on the mixture of Gaussian functions model, and the regions containing 68% of the posterior probability. Here, 
as well as throughout this work, we will consider the posterior median of any quantity to be the 'best-fit' for that 
quantity. In addition, in this work we will report errors at the 68% level unless specified otherwise, and therefore the 
regions containing 68% of the posterior probability can be loosely interpreted as asymmetric error bars of length ps la. 
As can be seen, the true value of (f>(\og Mbh, z) is contained within the 68% probability region for most of the values 
of log Mbh, even those below the survey detection limit. 

We compare our method with an estimate of the BHMF obtained by combining the broad line mass estimates with 
the more traditional 1/V a estimator, developed for luminosity function estimation. We do this primarily to illustrate 
the pitfalls that can arise from e mploying broad li ne mass estimates and not properly accounting for the black hole 
mass selection function. Following iFan et al.l (|2001h . we denote the effective volume of the i th source as V*. If the i th 
source lies in a redshift bin of width Az and has a luminosity £a,<j then 

v* = J < L ^ z ){^) dz - ( 6? ) 

Dividing up the (log Mbh, z) plane into bins of width A log Mbh x Az, one may be tempted to calculate an estimate 
of <fi(logMBH, z) based on the broad line estimates of log Mbh as 

4>BL(logM BH , z) = 1 In- ( 68 ) 

A log Mbh *-f K 

Here, the sum is over all sources with broad lines estimates XogMsn < log Mbl,i < log Mbh + A log Mbh and 

z < Zi < z + Az. 

Figure [5] also displays the expected value of 4>bl for z — 0.5, 1.5, 2.5, 3.5 and 4.5. In order to estimate the expected 
value of (f>BL at each z, we simulated 10 7 quasars at each redshift interval. This produces extremely small error bars 
on 4>bl and allows us to estimate the value of 4>bl that would be obtained on average, i.e., in the limit of an infinitely 
large sample. As can be seen, 4>bl is a biased estimate of the BHMF. This bias is caused by a combination of the 
relatively large statistical uncertainties on the broad line mass estimates, which broaden the inferred BHMF, and by 
the use of the luminosity selection function instead of the black hole mass selection function in the 1/V a correction. 
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Fig. 5. — The true BHMF (solid red line) at several redshifts. The axis labels are the same for all panels, but for clarity we only place 
exterior labels on the bottom left panel. Also shown is the posterior median estimate of the BHMF based on the mixture of Gaussian 
functions model (dashed blue line), the region containing 68% of the posterior probability (shaded region), and the expected value for a 
1/Va-type binned estimate based on the broad emission line estimates, 4>bl (thin bumpy solid green line). The vertical lines mark the 50% 
incompleteness limit for a quasar with FWHM = 4000 km s — x , a typical value for the simulated sources. Note that in general the best- fit 
mixture of Gaussian functions approximation to the BHMF will not equal the true BHMF, as it is derived from a finite random sample 
drawn from the true BHMF. The bayesian mixture of Gaussian functions model is able to accurately constrain the BHMF, even below the 
survey detection limit. However, 4>bl provides a biased estimate of the BHMF. 

The large statistical uncertainties on the broad line mass estimates broaden the inferred BHMF, and therefore 4>bl 
significantly overestimates the BHMF at the high mass end, while underestimating the BHMF near its peak. In 
addition, <Pbl underestimates the BHMF at the low mass end due to the inability of the 1/V a technique to completely 
correct for incompleteness. The end result i s a sy stematic shift in the inferred BHMF toward higher Mbh, and a 
similar effect has been noted by IS hen et al.1 (|2007f ). The effective volume in Equation (|6~T)) is defined based on the 
detection probability as a function of luminosity, and not black hole mass. As mentioned in § 12.31 in order to correctly 
apply the 1/V a estimator for BHMF estimation it is necessary to obtain the black hole mass selection function, given 
by Equation (jlip . However, this requires knowledge of p(L\\Mbh, z ). Furthermore, even if there were no selection 
effects, 4>bl would still be biased because of the significant uncertainty (~ 0.4 dex) on log Mbl- 

As in KFV08, we can use the MCMC output to constrain various quantities of interest calculated from the BHMF. 
Figure [5] compares the true integrated z < 6 number distribution of log Mbh , ?T-(log Mbh , z < 6), with the mixture of 
Gaussian functions estimate. The quantity n(log Mbh, z < G)dlogMsH is the number of quasars at z < 6 with black 
hole masses between log M B h and \ogM B H + dlogM B n- KFV08 give an equation for calculating n(logL,z < z ) 
based on the mixture of Gaussian functions model (see their Eq.[42]), and nilogMBH, z < z o) is calculated in an 
equivalent manner. Similar to Figure [SJ the true value of n(logMBH,z < 6) is contained within the 68% probability 
region for most values of Mbh, even those below the survey detection limit. 

In addition, in Figure[S]we show the comoving number density of broad line AGN as a function of redshift, n(z). This 
is obtained by integrating (/)(Mbh, z) over all possible values of Mbh, given by Equation (45) of KFV08. As before, 
the true value of n(z) is contained within the 68% probability region, despite the fact that the integration extends over 
all Mbh, even those below the detection limit. The wider confidence regions reflect additional uncertainty in n(z) 
resulting from integration over those Mbh below the detection limit. In particular, the term dV/dz becomes small at 
low redshift, making the estimate of n(z) more unstable as z — > 0, and thus inflating the uncertainties at low z. 

Two other potentially useful quantities are the comoving black hole mass density for quasars, p^ S H °{z), and its 

derivative. The comoving black hole mass density is given by P%b°(z) ~ J °° Mbh4>{Mbh, z) dMsn- The quantity 

Pbh°( z ) 1S gi yen by Equation (47) of KFV08 and replacing luminosity with black hole mass. We calculate the derivative 
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Fig. 6. — The integrated z < 6 quasar number density (number per log Mbh interval, left two panels) and the comoving quasar number 
density as a function of z (number per Mpc 3 , right two panels). The top two panels show a linear stretch and the bottom two panels 
show a logarithmic stretch. As with Figure [5] the solid red line denotes the true value for the simulation, the dashed blue line denotes 
the posterior median for the mixture of Gaussian functions model, and the shaded regions contain 68% of the posterior probability. The 
posterior median provides a good fit to the true values, and the uncertainties derived from the MCMC algorithm based on the Gaussian 
mixture model are able to accurately constrain the true values of these quantities, despite the flux limit. 
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Fig. 7. — Comoving broad line quasar black hole mass density (top two panels) and its derivative (bottom two panels), shown as a 
function of redshift (left two panels) and cosmic age (right two panels) . The plotting symbols are the same as in Figure [6] As in the 

previous figures, the Gaussian mixture model is able to provide an accurate fit to the true values of Pg^(z), and the bayesian MCMC 

approach is able to provide accurate constraints on p^^°(z) and dp® { ^ / dz. despite the fact that the integral used for calculating these 
quantics extends below the survey detection limit. 



of Pg H (z) numerically. Figure [7] compares the true values of Pg H (z) and its derivative with the posterior distribution 
for pg H ' (z) inferred from the mixture model, both as a function of z and the age of the universe at redshift z, t(z). 
Comparison with Figure [5] reveals that the comoving quasar black hole mass density, is a better constrained 

quantity than the comoving quasar number density, n(z). Furthermore, n(z) appears to peak later than p^^ (z). We 
can correctly infer that the quasar comoving black hole mass density reaches it point of fastest growth at t(z) < 1 
Gyr, and its point of fastest decline at t(z) ~ 4 Gyr. 

Figure [5] quantifies the suggestion that n(z) peaks later than p% S j^{z) by displaying the posterior distribution for 
the location of the respective peaks in n(z) and p^^f (z). While the location of the peak in n(z) is highly uncertain 
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Fig. 8. — Posterior distribution for the redshift location of the peak in the comoving number density of quasars (n(z), left) and the peak 
in the comoving quasar black hole mass density (p^^°(z), right). The spike in the posterior at z for values of the peak in n(z) arises 
because the term (dV/dz)~ 1 becomes very large at low z. The vertical lines denote the true values. The posterior distribution inferred 
from the MCMC output is able to accurately constrain the true values of the argumentative maximum in n(z) and p% S jp (z). 




we can still constrain it to be z < 1.5, whereas the location of the peak in p B n (z) is constrained to occur earlier at 
2 < z < 4. This is a consequence of the fact that while there were more quasars at z ~ 1 per comoving volume, their 
black hole masses were much higher at higher redshift. This evolution in characteristic Mbh is quantified in Figure 
El which summarizes the posterior distribution for the location of the peak in </>(log Mbh , z) as a function of redshift 
and t(z). As can be seen, the location of the peak in the BHMF shows a clear trend of increasing 'characteristic' Mbh 
with increasing z, although the mixture of Gaussian functions fit has difficulty constraining the location of the peak 
at low redshift. 

As noted in § 14.21 we can use the values of ao and &l to estimate the average Eddington ratio and the dispersion 
in \ogTEdd- We find a = 35.7j^i, ot m = 1.11+ ; 10 , and ai = 0.311q'o5, where the errors are at 95% confidence. For 
a bolometric correction of C\ = 10, and assuming that YEdd is independent of Mbh, this implies that our inferred 
typical Eddington ratio is Ysdd — 0-040^o.o36 & t 95% confidence; the estimated dispersion in logr^d is simply given 
by <7i, ~ 0.3 dex. While the typical Eddington ratio that we infer from ao is roughly consistent with the actual 
median TEdd of 0.25, our estimated dispersion in Tsdd underestimates the true value of 0.5 dex. This is because we 
incorrectly assume that the Mbh~L relationship is described by Equation (fl"9)) . Our inference regarding the Eddington 
ratio distribution is therefore biased because we assume that the distribution of YEdd does not evolve, and that the 
distribution is Gaussian. In particular, the bias resulting from the assumption of Gaussian dispersion appears to 
significantly affect the estimated dispersion in logr^d more than the estimated typical value of r Edd, at least for our 
simulation. This is largely because the distribution in YEdd is skewed toward lower values of YEdd- However, because 
of the flux limit, sources with low values of YEdd are undetectable. Because the dispersion in logYEdd is estimated 
from the detected sources, in combination with the assumption of a Gaussian distribution, Equation (|19p is not able to 
pick up the additional skew at low logYEdd- As a result, the estimated dispersion in logYEdd is underestimated when 
assuming a Gaussian distribution. We note that this bias is not a feature of our algorithm, but affects any analysis 
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Fig. 10. — BHMF at z = 2.5 for the simulated sample with n ~ 1000 detected sources (left) and n ~ 10 4 detected sources (right); the 
left panel is the same as the z = 2.5 BHMF shown in Figure[5] The uncertainties derived for the n ~ 10 4 are smaller than for the n ~ 1000 
sample, particularly at low Mgg where the survey becomes incomplete. However, the uncertainties for the n ~ 10 4 survey at high Mgu, 
where the survey is complete, are not considerably smaller than those for the n ~ 1000 survey. This is because the BHMF estimate is 
limited by the systematic uncertainty in the broad line mass estimate normalization, derived from /?o, and the broad line mass estimate 
statistical error, derived from &bl- Because the observed distribution of luminosities and line widths does not convey any information on 
these two quantities, increasing the sample size will not reduce the uncertainty on the BHMF beyond the systematic uncertainty on /3o and 

that attempts to infer the distribution of Eddington ratios using a flux-limited sample. 

In order to assess how the inferred BHMF depends on the sample size, we simulated a second data set in the sammcr 
manner as described above, but used N = 2 x 10 6 sources for the BHMF normalization. This gave usn~ 10 4 detected 
quasars. In Figure [TU] we compare the estimated BHMF at z — 2.5 for the survey with n ~ 1000 sources and n ~ 10 4 
sources. The uncertainties are lower for the survey with more sources, where the most noticeable improvement occurs 
at low Mbh- However, the increased sample size did not offer a significant amount of improvement at high Mbh, where 
sources are more easily detected. This is likely because the uncertainty in the broad line mass estimate normalization, 
/3o, and intrinsic scatter, gbl-, dominates the uncertainty in the BHMF at high Mbh- Because we cannot constrain [3q 
and <jbl from the distribution of line widths and luminosities, the data do not contain any information on (3$ and obl- 
Therefore, the likelihood function is unable to convey any information on /3 and <tbli an< ^ a ^ °^ the information comes 
from the prior distribution. As a result, our ability to constrain the BHMF is limited by the statistical uncertainty on 
Po and <jbl, and an increase in the sample size will eventually not result in a decrease in the uncertainty on the BHMF. 
The only way to reduce the uncertainty on the BHMF for large surveys is to better constrain the broad line mass 
estimate normalization and statistical uncertainty, most likely by increasing the sample of AGN with reverberation 
mapping data. 

Throughout this work we have assumed that the selection function is known, and that /3o, and <jbl are known 
within some statistical uncertainty. However, this may not be the case, and before concluding this section we briefly 
discuss how systematic error in the selection function, /3 and (Tbl, affect the inferred BHMF. We did not experiment 
with incorrect selection functions, and so it is not entirely clear how robust BHMF estimation is to errors in the 
selection function. However, from Equations (fT"5|) and (fT6|) it is clear that the selection function only enters into the 
posterior probability distribution (or likelihood function) via an integral that averages the selection function over the 
joint distribution of luminosity and redshift (i.e., the luminosity function). As a result, errors in the selection function 
will be smoothed out. Furthermore, they will be suppressed in regions where values of the luminosity function are 
small. Based on this, we do not think it likely that small errors in the selection function will introduce significant 
bias into the results; however, if the errors in the selection function are large enough to significantly bias the value of 
p(I = 1|L, z), then the results may be significantly biased as well. 

It is useful to work directly with the broad line mass estimates to assess the effect that systematic uncertainty on the 
values of the broad line mass estimate normalization and statistical uncertainty have on the inferred BHMF. Ignoring 
selection effects, one can think of our method as 'correcting' the BHMF inferred from binning up the broad line mass 
estimates. Therefore, if /3o is systematically underestimated, then this will result in a shift of the inferred BHMF 
toward higher masses. Similarly, if /?o is systematically overestimated, than the inferred BHMF will be shifted toward 
lower masses. In addition, the value of <jbl controls how much the BHMF inferred from the broad line mass estimates 
is artificially broadened by the statistical uncertainty in Mbl- A higher value oIgbl will result in a greater amount of 
broadening. Therefore, if our assumed values of ubl are systematically overestimated, then we would infer a greater 
amount of broadening than is real. As a result, our correction would be too large, and we would infer an intrinsic 
BHMF that is too narrow. Similarly, if our assumed values of obl are systematically underestimated, then we would 
not correct enough for the statistical uncertainty in the broad line mass estiamtes, and we would infer an intrinsic 
BHMF that is too broad. 



6.3. Using the MCMC Output to Evaluate the BHMF Fit 
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Fig. 11. — Posterior predictive check for the Gaussian mixture model (see § 16.31 1. The histograms show the actual distributions of 
log L f, s , z b 3 , and log FW H M i, a , the red squares denote the posterior medians for the number of sources in each respective bin, and the 
error bars contain the inner 90% of the histogram values for the samples simulated from the posterior. The mixture of Gaussian functions 
model is able to provide an accurate prediction of the observed distribution of luminosity, redshift, and line widths, and thus there is not 
any evidence to reject it as providing a poor fit. 



Throughout this section we have been analyzing the MCMC results by comparing to the true BHMF. However, in 
practice we do not have access to the true BHMF, and thus a method is needed for assessing the quality of the fit. As 
in KFV08, the statistical model may be ch ecked using a technique known as posterior predictive checking (e.g., Rubin 
1981, 1983; iGelman. Meng. fc Sternl fl998t) . Here, the basic idea is to use each of the MCMC outputs to simulate a 
new random observed data set. The distributions of the simulated observed data sets are then compared to the true 
observed data in order to assess whether the statistical model gives an accurate representation of the observed data. 
It is important to construct simulated data sets for each of the MCMC draws in order to incorporate our uncertainty 
in the model parameters. 

Random draws for Mbh and z for each MCMC draw may be obtained according to the procedure outlined in § 7.3 
of KFV08, after replacing L with Mbh- Once one obtains a random draw of Mbh arid z, simulated values of L\ 
may be obtained using Equation (fT9j) with ao,a m , and er;. Then, given these values of L\ and Mbh, values of v 
for each emission line can be simulated from Equation (f!M|) using the values of A), A, and <Jbl- Simulation from 
Equation (|24p requires a value of a\ in order to convert L\ to L . In order to account for the range in continuum 
slopes, we randomly draw of value of ct\ from our data set and use this value to convert to Lf L . These simulated 
values of L\, z, and v are then folded through the selection function, leaving one with a simulated observed data set 
{v obs ,L obs , z obs ). This process is repeated for all values of N and 6 obtained from the MCMC output, leaving one with 
simulated observed data sets of (v f, s , L obs , z obs ). These simulated observed data sets can then be compared with the 
true distribution of v obs , L obs , and z obs to test the statistical model for any inconsistencies. 

In Figure fTTI we show histograms for the observed distributions of z, log La, and log FWHM for the H/3, Mg II, and 
C IV emission lines. These histograms are compared with the posterior median of the observed distributions based on 
the mixture of Gaussian functions model, as well as error bars containing 90% of the simulated observed values. As 
can be seen, the distributions of the observed data sets simulated from our assumed statistical model are consistent 
with the distributions of the true observed data, and therefore there is no reason to reject the statistical model as 
providing a poor fit. 

7. APPLICATION TO BQS QUASARS 

As a final illustration of our method we use d it to estimate the low redshift active BHMF from the 87 z < 0.5 
quasars from the Bright Quasar Survey (BQS, ISchmidt fc Greer] 119831). The H/3 line width s and continuum lumi- 
nositie s for 71 of the BQS qua sars are taken from Table 7 of IVestergaard fc Peterso n (2006), and 16 of the quasars 
in the iBoroson fc Greenl (|1992l ) sample have black hole mass estimates from reverberation mapping ([Peterson et al.l 
200 4|). For each source with reverberation mapping data, we used the first entry of AL A (5100A) in Table 1 of 
Vestergaard fc Peterson! (120061 ) as t he single-epoch luminosit y; these values were based on continuum luminosities 
reported bv IBoroson fc Greenl (|1992h or lMarziani et all (|2003h . We assumed measurement errors of 10% on the emis- 
sion line F WHM. The BQS samp le covers an area of il = 10, 714 deg and is selected with an a yerage flux limit o f 
B = 16.16 (Schmi dt fc Greenl 1 19831 ). with no apparent correlation with redshift and U — B color ([Jester et al.l feOOS'). 
We conv erted the B = 16.16 flux limit to a flux limit at 5100A assuming a power law continuum, ] v oc v~ a , with 
a = 0.5 (Richards et al.ll2001h . We used K = 3 Gaussian functions to fit 4>(Mbh, z) for z < 0.5. 
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Fig. 12. — The z = 0.17 (left) and z = 0.5 (right) broad line quasar BHMF as estimated from the BQS sample. The dashed line denotes 
the posterior median for the mixture of Gaussian functions model, the shaded region contains 68% of the posterior probability, and the 
tick marks along the x-axis mark the locations of the broad line mass estimates. The estimate of the z = 0.17 BHMF becomes significantly 
uncertain at M BH < 10 s Mq, and the z = 0.17 BHMF appears to fall off as a power law above M BH > 10 s M e . The z = 0.5 BHMF is 
not very well constrained, but th ere is evidence for a shift in the BHMF toward higher Mgu from z = 0.17 to z = 0.5. For comparison, 
we show the BHMF esti mated by Vcstcrgaard (2006) using the BQS sources (left, solid line with error bars), and the BHMF estimated by 
IVestereaard et alj ||2008T ) using the SDSS DR3 quasars (right, solid line with error bars). The shift in the BHMF inferred from the binned 
mass estimates is apparent in the BQS sample, while the SDSS and BQS z = 0.5 BHMF estimates agree fairly well. 

Because we are including the actual values of Mbh for the 16 reverberation mapping sources, the contribution to 
the posterior for these sources is 

16 

p(0\M BH ,L x ,z) = Y[p(logL x , i \M BH ,i,e)p{logM BH , i ,logz i \e). (69) 

i=l 

Here, p(L\^\M B H,i,0) is given by Equation (fH?)) and p(\og M B H,i, log Zi\9) is given by Equation (|17[) . The product in 
Equation (|69p is only over the quasars with M BB estimated from reverberation mapping, whereas the contribution 
to the posterior for the BQS sources without reverberation mapping is given by Equation (fT5|) . The posterior for the 
complete BQS sample is then the product of Equation (|69f and Equation (|15p. 

In Figure [T2l we show the z = 0.17 BHMF derived from the BQS sam ple. Also shown is the binned BHMF for the 
BQS sources, calculated directly from the broad line mass estimates by IVestergaardl ([20061 ) . We show the BHMF at 
z = 0.17 because the average redshift of the BQS sources is z 0.17, therefore allowing a more direct comparison 
between the binned BHMF and the BHMF derived using our mixture of Gaussian functions approach. In addition, 
the uncertainties on our estimated BHMF are smallest at z ss 0.17. We are able to place some constraints on the 
local BHMF, despite the fact that the BQS sample only contains 87 sources and has a very shallow flux limit. The 
z ~ 0.2 quasar BHMF appears to fall off as a power law above M B h > 10 8 M Q . Unfortunately, our estimate of the 
local BHMF becomes considerably uncertain below M B h < 10 8 A/ Q , so it is unclear to what degree the power law 
trend continues below this point. In addition, the binned estimate overestimates the BHMF at the high M B h end due 
to the intrinsic uncertainty in the broad line mass estimates, and underestimates the BHMF at the low M BB end due 
to incompleteness, in agreement with our simulations (see § 16. 2[) . 

In Figure [T2l we also co mpare our estimat e of th e BHMF at z = 0.5 with the z = 0.5 BHMF as reported by 
IVestergaard et al.1 (|2008h . IVestergaard et all (|2008f ) estimated the z = 0.5 BHMF by binning estimates of M BH 
derived from the H/3 and Mg II broad emission lines over the redshift range 0.3 < z < 0.68, using the SDSS DR3 
quasar catalogue (Schneide r et al.ll2005f h Despite the differences in approach and survey selection, the two estimates 
of the z = 0.5 BHMF agree fairly well. However, because z — 0.5 defines the upper redshift limit of our BQS sample, 
the uncertainties on the BHMF deriv ed from the BQS quasar s are very lar ge. In addition, incompl eteness in M B u 
likely affects the low M BB bins of the IVestergaard et al. (2008), causing the Vesterg aard et all (|2008f ) z = 0.5 BHMF 
to underestimate the true z = 0.5 BHMF in these bins, a fact reflected by the larger error bars. However, a direct 
comparison between our Bayesian approach and the IVestergaard et alj (|2008h estimate is difficult, due to the different 
redshift ranges used to estimate the BHMF, and the different selection methods of the BQS and the SDSS. 

Although the BQS has a small sample size and probes a narrow range in z, we can attempt to quantify any evolution 
in the local BHMF by comparing the ratio of the comoving number density of quasars at two different values of M B h ■ 
Comparison of the estimated BHMF at z = 0.17 and z — 0.5 suggests a shift in the BHMF toward large M BB . 
In Figure fl3l we show the best fit values of the ratio of <fi(log M BB , z) at M BB = 5 x 1O 8 M0 to </>(log M BB , z) at 
M BB = 5 x 10 9 M© as a function of z, as well as the 68% confidence interval. The logarithm of this ratio gives the 
slope of a power-law between Mbh = 5 x 10 8 M q and Mbh = 5 x 1O 9 M , and therefore allows us to probe evolution 
in the shape of the quasar BHMF at the high M BB end. In general, the ratio is fairly flat, implying no evolution in 
the high M BB slope of the BHMF. However, at z > 0.3 there is marginal evidence for a flattening of the high M BB 
slope of the BHMF. The values of this ratio imply that the BHMF at the high Mbh end falls off as a power-law with 
slope ~ 2, although slopes of ~ 1 and ~ 3 are also consistent with the BQS quasars. 
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Fig. 13. — The ratio of the broad line quasar BHMF at M BH = 5 X 1O 9 M compared to the BHMF at Mbh = 5 x 10 s M , as a function 
of 2 and estimated from the BQS quasars. The dashed line is the posterior median, and the shaded region contains 68% of the probability. 
Assuming that the BHMF is a power-law from Mbh = 5 X 1O 8 M0 to Mbh = 5 X 10 9 Mq, the logarithm of this ratio is the slope of the 
BHMF. The high Mbh BHMF slope appears to be fairly constant for z < 0.3 with a slope of ~ 2, and there is marginal evidence for a 
flattening of the high Mbh slope at z > 0.3. 



In figure [HI we summarize the posterior probability distribution for the parameters governing the distribution of L\ 
at a given Mbh (see Eq.[l9]). Based on the MCMC results, we can constrain the Mb#-ALa(5100A) relationship at 
z < 0.5 to be 

/ . , x 0.92±0.24 

Ai A (5100A) = 5.18lf» x 10 36 (^^J [erg s" 1 ], (70) 

where we have quoted the errors at 95% confidence. The dispersion in L 510 o at a given Mbh is est imated to be 
ai = 0.35^° Jg. Assuming that the bolometric correction is on average C 5 i o ~ 10 (e.g. JKaspi et al.ll2000f) . comparison 
of Equation ([70)1 with Equation (|2"D|) suggests that z < 0.5 broad line AGN have typical Eddington ratios of TEdd ~ 0.4. 
As argued in § I4.2i the distribution in log L\ at a given Mbh is the convolution of the distribution of log TEdd with the 
distribution of logCA. Therefore, the dispersion in L\ at a given Mbh is a combination of the dispersion in Eddington 
ratio and bolometric correction. As a result, we are unable to estimate the dispersion in Eddington ratios at a given 
Mbh from 07. However, if the bolometric correction to £5100 increases with increasing Eddington ratio, as found by 
iVasudevan fc Fabianl (j2007) , or if the bolometric correction is independent of r^^, then the dispersion in TEdd must 
be less than o - ;. Therefore, because we infer that <j\ < 0.5 dex, our results imply that the dispersion in Eddington ratios 
at a given Mbh is < 0.5 dex for z < 0.5 broad line quasar s . These results on the Eddington ratio distribution are 
consistent with previous work (e.g.. iMcLure fc Du nlop 2004; Vestcrgaard 2 0~04t iKollmeier et al.ll200"6l) : however, they 
may be biased because of our assumption of a Gaussian and non-evolving Eddington ratio distribution. In particular, 
if the distribution of Eddington ratios is skewed toward low \0gTEdd1 then we will have underestimated the intrinsic 
dispersion in logr^dd- 

8. SUMMARY 

We have derived the observed data likelihood function which relates the quasar BHMF to the observed distribution 
of redshifts, luminosities, and broad emission line widths. This likelihood function is then used in a Bayesian approach 
to estimating the BHMF, where the BHMF is approximated as a mixture of Gaussian functions. Because much of this 
work was mathematically technical, we summarize the important points here. 

• In this work we describe a flexible parameteric model for the BHMF, where the BHMF is modeled as a mixture 
of Gaussian functions. The distribution of luminosities is modelled as a linear regression of log La as a function 
of log Mbh, where the distribution of log La at a given Mbh was assumed to follow a normal distribution. The 
distribution in line widths at a given La and Mbh is also assumed to have the form of a linear regression, where 
the parameters are based on the most recent broad line mass estimates. Equation (jTSJ) gives the BHMF under 
the mixture of Gaussian function model. 

Equations (|30[) and (|46[) define the likelihood function for broad line mass estimates under the mixture of Gaussian 
functions model if only one emission line at a given z is used to estimate Mbh- Otherwise, if multiple emission 
lines are used for a single quasar, then Equation (|4"Tj) must be used. The posterior is then found by inserting the 
prior distribution and likelihood function into Equations (|15[) and (|16[) . 
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Fig. 14. — Posterior distributions of the parameters for the distribution of luminosities at a given Mbh, as estimated from the z < 0.5 
BQS quasars. The uncertainty on oq and a m is highly correlated. Assuming a bolometric correction of C5100 ~ 10, the values of ag and 
<t 1 imply that the z < 0.5 distribution of broad line quasar Eddington ratios has a mean of T^dd ~ 0.4 and a dispersion of ~ 0.5 dex. 

• Using methods developed for luminosity function estimation (e.g., l/V a -type estimators) without modification 
will lead to errors in black hole mass function estimation, as the black hole mass selection function is not 
equivalent to the flux selection function. In addition, using broad line estimates of Mbh will lead to a broader 
inferred BHMF if one does not correct for the intrinsic uncertainty in the broad line mass estimates. This causes 
one to overestimate 4>(Mbh, z ) hi the tails of the distribution, and underestimate c/)(Mbh, z) near the peak of the 
distribution. However, because low Mbh AGN are more likely to be missed by flux- limited surveys, <J)(Mbh, z) 
will be underestimated at low Mbh due to incompleteness. The end result is a spurious shift in the inferred 
BHMF toward higher Mbh'- incompleteness at low Mbh causes one to miss low Mbh sources while the intrinsic 
statistical uncertainty on the broad line mass estimates causes one to overestimate the number of high Mbh 
black holes. 

• In § 14.51 we modify the likelihood function to include measurement error in the emission line width. We show 
that if the measurement errors on the line width are much smaller than the intrinsic physical dispersion in line 
widths, then measurement error may be neglected. However, if measurement error on the line width is a concern, 
Equations (|5r?)) - (|55|) should be used for Equation (|2U)) instead of Equation ([501) . 

• We describe in § [5] a Metropolis-Hastings algorithm (MHA) for obtaining random draws from the posterior 
distribution of the BHMF under the mixture of Gaussian functions model. These random draws may be used 
to estimate the posterior distribution for the BHMF, as well as to estimate the posterior for any quantities 
calculated from the BHMF. The posterior provides statistically accurate uncertainties on the BHMF and related 
quantities, even below the survey detection limits. We use simulation in § [6] to illustrate the effectiveness of our 
statistical method, as well as to give an example on how to use the MHA output to perform statistical inference. 

• We concluded by applying our method to obtain an estimate of the local unobscured quasar BHMF from the 
z < 0.5 BQS quasar sample. Although there is little information in the BQS quasars on the BHMF at Mbh < 
I0 8 Mq, the mixture of Gaussian functions estimate suggests that the local quasar BHMF falls off approximately 
as a power law with slope ~ 2 for M B h > 1O 8 M at z w 0.2. The local quasar BHMF appears to shift toward 
larger Mbh at higher z, and there is marginal evidence for a flattening of the high mass BHMF slope at z > 0.3. 
We estimate that at a given Mbh, z < 0.5 broad line quasars have a typical Eddington ratio of ~ 0.4 and a 
dispersion in Eddington ratio of < 0.5 dex. However, the estimate of the dispersion in Eddington ratio could 
be biased toward smaller values if the true distribution of Eddington ratios is significantly skewed toward lower 
values. 
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